I. INTRODUCTION

Despite all the recent advances in the areas of communications and data coding,
there are still a large number of applications where the achievable data rate is not
sufficient for the task. There is an even larger number of tasks for which an
improvement in data compression would enable us to do the job better or more
efficiently. Two prime examples of the types of data for which better coding is desirable
are digital speech and image data. Both of these data types require an extremely high
data rate for real-time transmission. Both also display a wealth of internal structure
that can be utilized for compression by a coding system. Finally, both of these types
of signals can be transmitted with a certain amount of distortion and still provide
the required information. For example, it may be sufficient to maintain intelligibility
for speech data, and it may be sufficient for an image to display enough detail for an
analyst to recognize certain key features rather than to provide a faithful bit-by-bit
reproduction. This is in contrast to many other types of digital data, for which our
principal interest is to add error-correcting capability until the probability of a single
bit error is vanishingly small. All these factors combine to make improved compression
techniques for speech and image data a worthy goal and thus an active area of research
in digital signal processing.

In signals for which we can tolerate some distortion, there must be some method
for measuring the distortion relative to the original signal. Distortion measures fall
into two basic categories: subjective and objective. Subjective measures are the result
of human impressions of the comparison between the original and distorted versions,
while an objective measure has a closed-form mathematical expression by which we can
compare competing systems. For our study we desire a data type that has a simple
objective measure that corresponds well to the results of subjective measures.
Fortunately, for image data there exists a distortion measure, mean square error, which
is both easy to calculate and corresponds fairly well to subjective distortion results.
Thus in this thesis, we concentrate on the compression of image data.

There are many schemes for compressing image data, but few have been successful
in producing good image quality at low data rates. Generally, images are
coded with each pixel assigned a grey level from 0 to 255. This corresponds to a data
rate of 8 bits/pixel. Several examples showing how three common coding techniques
perform at low data rates are shown in the following figures. First, we examine the
technique of scalar quantization, in which we map the 256 grey levels into a smaller
number, which can then be transmitted using a smaller number of bits. Figure 1.1
shows the original at 8 bits/pixel, Figure 1.2 shows scalar quantization at a data rate
of 4 bits/pixel, Figure 1.3 shows a data rate of 2 bits/pixel, and Figure 1.4 shows a
data rate of 1 bit/pixel. Clearly this technique produces poor results below about
4 bits/pixel. Second, we examine the technique of delta modulation, in which we
encode the difference between the current and previous pixels using a raster scan.
Figure 1.5 shows the original image, Figure 1.6 shows delta modulation at a data rate
of 4 bits/pixel, Figure 1.7 shows a data rate of 2 bits/pixel, and Figure 1.8 shows
a data rate of 1 bit/pixel. While delta modulation is an improvement over scalar
quantization, it tends to perform poorly below about 2 bits/pixel. Third, we examine
a transform technique, the two-dimensional fast Fourier transform (2-D FFT).
Figure 1.9 shows the original image, Figure 1.10 shows the 2-D FFT at a data rate of
4 bits/pixel, Figure 1.11 shows a data rate of 2 bits/pixel, and Figure 1.12 shows a
data rate of 1 bit/pixel. This technique offers fairly good reproduction down to
2 bits/pixel.


Figure 1.1 Original Image

Figure 1.2 Scalar Quantization at 4 bits/pixel

Figure 1.3 Scalar Quantization at 2 bits/pixel

Figure 1.4 Scalar Quantization at 1 bit/pixel

Figure 1.5 Original Image

Figure 1.6 Delta Modulation at 4 bits/pixel

Figure 1.7 Delta Modulation at 2 bits/pixel

Figure 1.8 Delta Modulation at 1 bit/pixel

Figure 1.9 Original Image

Figure 1.10 2-D FFT at 4 bits/pixel

Figure 1.11 2-D FFT at 2 bits/pixel

Figure 1.12 2-D FFT at 1 bit/pixel

Now  we  compare  the  previous  techniques  to  a  more  powerful  method,  vector 
quantization.  In  vector  quantization,  we  encode  an  entire  block  of  data  using  a  single 
codeword.  This  codeword  is  produced  by  comparing  the  block  to  be  encoded  with  a 
codebook  of  example  blocks  and  choosing  the  example  which  is  closest  in  some  sense. 
A  comparison  of  vector  quantization  and  the  previous  three  techniques  is  made  in  the 
following  figures  at  a  data  rate  of  1  bit/pixel.  Figure  1.13  shows  scalar  quantization 
at  1  bit/pixel,  Figure  1.14  shows  delta  modulation  at  1  bit/pixel,  Figure  1.15  shows 
the  2-D  FFT  at  1  bit/pixel,  and  Figure  1.16  shows  vector  quantization  at  1  bit/pixel. 
Clearly the technique of vector quantization is superior at low data rates. There
exist other coding techniques [Ref. 1], such as linear predictive coding and the discrete
cosine transform, which have been successful in image coding but were not considered
in these examples for the sake of brevity.

A.      THESIS  OBJECTIVE 

Vector quantization has not been commonly used because of the large computational
cost involved in generating the codebook and finding the closest codeword
for each block to be transmitted. Recently, a revived interest in neural network
research has shown some promise for an efficient implementation of vector
quantization. In this task we benefit from the neural network's ability to quickly perform
categorizations (which accelerates the codebook generation), and also from its parallel
processing capability (which speeds the process of comparing an input block
to the codebook). This thesis concentrates on addressing the difficulties in
implementing vector quantization using neural networks. Full search, tree search, and
multistage VQ schemes are studied for this purpose. Results of the application of
these techniques to image data are presented.


Figure 1.13 Scalar Quantization at 1 bit/pixel

Figure 1.14 Delta Modulation at 1 bit/pixel

Figure 1.15 2-D FFT at 1 bit/pixel

Figure 1.16 Vector Quantization at 1 bit/pixel


B.     THESIS  OUTLINE 

In the second chapter we describe the basic theory of vector quantization,
introduce the existing VQ algorithms, and present some simple examples of how a
codebook is generated. In the third chapter we introduce the basic concepts of neural
networks, discuss the types of neural network learning, and present the algorithms
which can be applied to the problem of vector quantization. In the fourth chapter we
identify the shortcomings of existing neural network vector quantizers, and apply the
tree search, multistage, and classification vector quantization schemes in an effort to
improve performance.


II.  VECTOR  QUANTIZATION 

A.  INTRODUCTION 

One  of  the  results  of  Shannon's  rate-distortion  theory  [Ref.  3]  is  that  better 
results  can  always  be  obtained  if  vectors  are  used  in  coding  rather  than  scalars. 
This  result  applies  even  if  some  technique  has  been  applied  to  the  input  data  to 
remove  all  correlation.  Although  delta  modulation  and  transform  methods  provide 
substantial  improvement  over  scalar  quantization,  they  all  use  scalar  coding  and  are 
thus  suboptimal.  As  we  saw  in  the  examples  presented  previously,  vector  quantization 
provides  a  dramatic  improvement  in  reproduction  quality  for  low  data  rates.  In  this 
chapter  we  examine  the  technique  of  vector  quantization  as  it  applies  to  images  and 
review  existing  methods  of  implementation. 

B.  DETAILS  OF  THE  METHOD 

Vector Quantization (VQ) [Ref. 2] consists of two mappings: an encoder,
$\gamma(\mathbf{x})$, which assigns a channel codeword, $\mathbf{u} = (u_1, u_2, \ldots, u_p)$,
to each input vector, $\mathbf{x} = (x_0, x_1, \ldots, x_{k-1})$, from a set of possible
channel symbols called a codebook, and a decoder, $\beta(\mathbf{u})$, which assigns a code
vector, $\mathbf{y}$, to each channel codeword. Note that each input vector is just a vector
version of a block from the subject image, with each pixel value corresponding to an
element in the vector. The channel symbols consist of all possible binary $p$-tuples,
where the size of the input vector and the size of the channel codeword are in general
not the same. The number of possible channel codewords is $2^p$, and thus the bit rate
of the vector quantizer is $p$ bits/block or $r = p/k$ bits/pixel. It is interesting to
note that by properly selecting $p$ and $k$, we can generate any fractional bit rate
that we desire. This is in contrast to scalar quantization, where we are limited to
integer bit rates. Figure 2.1 shows the basic structure of a vector quantizer system.
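As a small worked check of the rate formula, the following sketch computes r = p/k for a codebook holding 2^p code vectors. The function name `vq_rate` and the example sizes are illustrative assumptions and are not values taken from this work.

```python
import math


def vq_rate(codebook_size: int, block_pixels: int) -> float:
    """Return the VQ bit rate r = p / k in bits/pixel.

    codebook_size is the number of code vectors (assumed to be 2**p),
    block_pixels is the number of pixels k in one image block.
    """
    p = math.log2(codebook_size)  # bits needed to index one code vector
    return p / block_pixels


# Hypothetical example: 256 code vectors (p = 8) on 4x4 blocks (k = 16)
# gives a fractional rate of half a bit per pixel.
half_bit = vq_rate(256, 16)
```

A scalar quantizer corresponds to k = 1, which is why its rate is restricted to whole numbers of bits per pixel.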

Figure 2.1: Vector Quantization

The basic goal of the vector quantizer design is to discover which specific set of
encoder and decoder mappings will give the best reproduction of the image in some
sense. This depends upon the cost function or distortion which we use to measure
the quality of the output image. We wish to find a distortion or distance measure
between our input image, $\mathbf{X}$, and the output image, $\hat{\mathbf{X}}$, that is easy
to compute and provides good correspondence with subjective image quality. The measure
chosen for this work is the mean square error (MSE), which is defined as

$$ e = d(\mathbf{X}, \hat{\mathbf{X}}) = \frac{1}{N^2}\,\|\mathbf{X} - \hat{\mathbf{X}}\|^2 = \frac{1}{N^2} \sum_{i=1}^{N} \sum_{j=1}^{N} \left(x_{ij} - \hat{x}_{ij}\right)^2 \tag{2.1} $$

where $N$ is the number of pixels along one side of the image.
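The MSE of Equation 2.1 can be sketched in a few lines. This is a minimal illustration that assumes a square N x N image stored as a list of rows and assumes the squared error is averaged over all N^2 pixels; the function name `mean_square_error` is hypothetical.

```python
def mean_square_error(original, reproduced):
    """Average squared pixel difference between two N x N images.

    Both images are lists of N rows, each row a list of N grey levels.
    """
    n = len(original)  # number of pixels along one side of the image
    total = 0.0
    for row_x, row_y in zip(original, reproduced):
        for x, y in zip(row_x, row_y):
            total += (x - y) ** 2
    return total / (n * n)
```

For a 2 x 2 image in which two pixels differ by 2 and 4 grey levels, the result is (4 + 16)/4 = 5.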

We will now examine the conditions under which the vector quantizer will be
optimal. First we define the set of all possible code vectors, $\mathbf{y}$, as
$C = \{\mathbf{y} : \mathbf{y} = \beta(\mathbf{u}) \text{ for some channel codeword } \mathbf{u}\}$.
This set of code vectors together with the corresponding codewords is called
the codebook. For the vector quantizer to be optimal it must display two properties
[Ref. 5].

First,  the  encoder  must  select  the  code  vector  which  is  closest  to  the  input 
vector  according  to  the  distortion  measure.  In  our  case  this  is  the  mean  square  error. 
This  can  be  stated  as 

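$$ d(\mathbf{x}, \beta[\gamma(\mathbf{x})]) = \min_{\mathbf{u}} d[\mathbf{x}, \beta(\mathbf{u})] = \min_{\mathbf{y} \in C} d(\mathbf{x}, \mathbf{y}). \tag{2.2} $$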

Thus the encoder can be thought of as a device which partitions the input vector
space into sections, each of which surrounds a code vector. Any input which falls into
a section will be assigned the codeword corresponding to the code vector contained in
that section. The encoder can also be viewed as a device which divides the input vector
space into a group of sections for which all input vectors occurring in a section are
grouped together and transmitted as a single representative code vector.
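The nearest-neighbor rule of the first property can be sketched as follows. This is a minimal illustration that assumes squared error as the distortion and represents each codeword by the integer index of its code vector; the names `squared_error` and `encode` are hypothetical.

```python
def squared_error(x, y):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def encode(x, codebook):
    """Return the index of the code vector closest to input vector x.

    Ties are resolved in favor of the lowest index.
    """
    return min(range(len(codebook)), key=lambda j: squared_error(x, codebook[j]))
```

Every input vector that falls in the same section of the partition receives the same index, which is the single codeword sent over the channel.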

Second, for an encoder $\gamma$, the decoder must assign as the code vector the
generalized centroid of all the vectors which are encoded into that codeword. In our case
the centroid can be expressed as

$$ \mathbf{y} = \beta(\mathbf{u}) = \operatorname{cent}(\mathbf{u}) = \frac{1}{i(\mathbf{u})} \sum_{\mathbf{x}\,:\,\gamma(\mathbf{x}) = \mathbf{u}} \mathbf{x}, \tag{2.3} $$

where $i(\mathbf{u})$ is the number of input vectors that are mapped to $\mathbf{u}$. This is the
selection of the code vector which will minimize the distortion,
$E[d(\mathbf{x}, \mathbf{y}) \mid \gamma(\mathbf{x}) = \mathbf{u}]$, for a particular encoder.
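For the squared error measure the generalized centroid reduces to the ordinary arithmetic mean of the vectors in a cell. A minimal sketch, assuming the cell is a non-empty list of equal-length vectors (the name `centroid` is hypothetical):

```python
def centroid(cell):
    """Component-wise mean of the input vectors mapped to one codeword."""
    count = len(cell)  # plays the role of i(u): the number of vectors in the cell
    return [sum(component) / count for component in zip(*cell)]
```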

If we carefully examine the previous two properties, we can see that the first
gives us a method to optimize an encoder for a given decoder, and the second gives us a
method to optimize a decoder for a given encoder. This suggests an iterative technique
of applying these two properties successively until convergence is obtained or some
desired distortion level is reached. This is in fact the basis for the generalization of
Lloyd's optimal scalar quantizer [Ref. 4], which was produced by Linde, Buzo, and
Gray [Ref. 6] and is referred to as the LBG algorithm. The algorithm consists of the
following steps:

•  Step  1  Choose  an  initial  decoder. 

•  Step  2  Encode  the  image  using  the  given  decoder  (optimize  the  encoder  as  in 
the  first  property).  If  the  distortion  is  small  enough,  terminate  the  algorithm. 

•  Step 3 For each codeword u replace the corresponding code vector with the
centroid of all input vectors that mapped to u in the encoder produced by step
2 (optimize the decoder as in the second property). Then repeat step 2.
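The three steps above can be sketched as a short routine. This is an illustrative sketch under stated assumptions and not a definitive implementation: the stopping rule (improvement in average distortion no larger than `tol`), the iteration cap `max_iter`, and the choice to leave a code vector unchanged when no input maps to it are all assumptions introduced here.

```python
def squared_error(x, y):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def lbg(vectors, initial_codebook, tol=1e-6, max_iter=100):
    """Alternate nearest-neighbor encoding and centroid updates.

    Returns the final codebook and the average distortion per input vector.
    """
    codebook = [list(c) for c in initial_codebook]  # Step 1: initial decoder
    previous = float("inf")
    distortion = previous
    for _ in range(max_iter):
        # Step 2: encode every input vector with the current codebook.
        cells = [[] for _ in codebook]
        total = 0.0
        for x in vectors:
            j = min(range(len(codebook)),
                    key=lambda n: squared_error(x, codebook[n]))
            cells[j].append(x)
            total += squared_error(x, codebook[j])
        distortion = total / len(vectors)
        if previous - distortion <= tol:  # assumed stopping rule
            break
        previous = distortion
        # Step 3: replace each code vector by the centroid of its cell.
        for j, cell in enumerate(cells):
            if cell:  # an empty cell keeps its old code vector (assumption)
                codebook[j] = [sum(col) / len(cell) for col in zip(*cell)]
    return codebook, distortion
```

Each pass can only lower or hold the average distortion, so the loop settles on a locally optimal codebook that depends on the initial decoder.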

The last detail to be addressed is the selection of the initial decoder. Clearly in
an iterative technique such as this, a good initial selection can make a large difference
in the number of iterations required for convergence to the final result. Several
techniques have been employed (see for example [Ref. 7]). First, we can just select the
appropriate number of input vectors from the image and use them as the code
vectors in the initial codebook. Second, we can apply a scalar quantizer to each element
of the vector and generate the number of values needed to form the code vectors.
Lastly, we can use a technique known as splitting. In this technique we start with a
codebook of size one, which is just the centroid of the entire data set. Then we split the
code vector by adding and subtracting a small vector from the original code vector
and optimize this new codebook of size two. Then we split the two resulting code
vectors into a codebook of size four and optimize this codebook as well. We continue
this pattern of splitting and optimization until the desired codebook size is reached.
Of the initialization techniques described above, the splitting technique is most often
used because it initializes each step with a good initial guess, which limits the number
of iterations required.
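The splitting technique can be sketched as below. This is a minimal illustration under several assumptions: the small perturbation vector has every component equal to `eps`, the optimization after each split is a fixed number of encode and centroid passes rather than a run to convergence, and the target size is a power of two. All function names are hypothetical.

```python
def sqerr(x, y):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def mean(vectors):
    """Component-wise mean (centroid) of a non-empty list of vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def refine(vectors, codebook, passes=20):
    """Run a fixed number of encode/centroid passes on the codebook."""
    for _ in range(passes):
        cells = [[] for _ in codebook]
        for x in vectors:
            j = min(range(len(codebook)), key=lambda n: sqerr(x, codebook[n]))
            cells[j].append(x)
        # An empty cell keeps its previous code vector.
        codebook = [mean(c) if c else old for c, old in zip(cells, codebook)]
    return codebook


def split_codebook(vectors, size, eps=0.01):
    """Grow a codebook from size one to `size` by repeated splitting."""
    codebook = [mean(vectors)]  # size-one codebook: centroid of all the data
    while len(codebook) < size:
        # Split each code vector into a +eps copy and a -eps copy.
        codebook = [[c_i + sign * eps for c_i in c]
                    for c in codebook for sign in (+1, -1)]
        codebook = refine(vectors, codebook)
    return codebook
```

Because each split starts from code vectors that are already optimized for the smaller codebook, every refinement begins close to a good solution.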

Now  we  present  a  two  dimensional  example  to  give  an  intuitive  feel  for  how  the 
splitting  algorithm  progresses.  The  data  set  for  the  example  is  presented  in  Figure  2.2. 
This  data  set  has  been  chosen  to  have  a  simple  structure  in  order  to  eliminate  the 
need  for  many  iterations  at  each  splitting.  In  this  example  we  attempt  to  generate  a 
vector  quantizer  of  size  four.  Step  1  is  to  form  the  codebook  of  size  one  by  calculating 
the  centroid  of  the  entire  data  set  (See  Figure  2.3).  Step  2  is  to  split  the  size  one 
codebook  into  a  size  two  codebook  by  adding  and  subtracting  a  small  vector  (See 
Figure  2.4).  Step  3  is  finding  the  vector  subspaces  which  define  the  decision  areas  for 
the  new  codebook  (See  Figure  2.5).  Step  4  is  to  calculate  the  centroid  of  each  of  the 
new  vector  subspaces  and  make  these  centroids  the  new  code  vectors  (See  Figure  2.6). 
Step  5  is  to  split  the  newly  generated  size  two  codebook  into  a  size  four  codebook  by 
adding  and  subtracting  a  small  vector  from  each  code  vector  (See  Figure  2.7).  Step 
6  is  to  calculate  new  vector  subspaces  for  each  of  the  code  vectors  (See  Figure  2.8). 
Step  7  is  to  calculate  the  centroid  of  each  new  subspace  and  assign  them  as  code 
vectors  (See  Figure  2.9).  Finally,  step  8  is  to  find  the  vector  subspaces  corresponding 
to  the  decision  regions  for  our  final  codebook  (See  Figure  2.10)  and  the  algorithm  is 
complete. 

For this example it is easy to see where the code vectors for a vector quantizer
of size four should be placed: one at the centroid of each of the four clusters. This is
the result produced by the LBG algorithm. However, we must keep in mind that this
example was carefully contrived to eliminate the large number of iterations required
for optimization at each splitting. A problem from a real data set, even if the
dimension and size are the same as in our example, would be much more computationally
expensive.

Figure 2.3 Step 1, Centroid of Data Set

Figure 2.4 Step 2, First Point Splitting

Figure 2.7 Step 5, Second Point Splitting

Figure 2.8 Step 6, New Subspaces

Figure 2.9 Step 7, New Centroids

Figure 2.10 New Subspaces, Algorithm Complete


III. NEURAL NETWORKS

A.      INTRODUCTION 

Artificial Neural Networks [Ref. 8] have recently been the subject of intense
research because of a desire to develop machines which can achieve human-like
performance in such areas as speech and image recognition. After a lengthy period of
inactivity in this area, the recent development of new algorithms, advances in
analog VLSI techniques, and a new emphasis on parallel computing have contributed to
major advances in this field.

Like  their  biological  counterparts,  neural  networks  rely  on  a  large  collection  of 
simple  but  highly  connected  processing  elements.  This  enables  the  neural  network  to 
avoid the sequential instruction processing characteristic of the von Neumann
computer, and instead process many possible results in parallel. This property makes
a  neural  network  an  attractive  option  to  investigate  in  many  recognition  problems. 
Neural  networks  are  also  designed  to  adaptively  update  the  interconnection  weights 
between  processing  elements  in  an  effort  to  improve  their  performance.  This  adaptive 
updating  is  termed  "learning."  This  property  allows  a  neural  network  to  continue  to 
function  well  despite  changes  in  the  statistics  of  the  input  data. 

A neural network is a good tool in pattern recognition because of its ability to
quickly categorize an input pattern in a previously learned category. However, there
also exist different algorithms which are equally proficient at taking a data set and
forming the occurring patterns into categories without supervision, that is, without
external definition of the categories to be used by the neural network. Thus with
some modifications, a neural network can be made to do a task which is very similar
to vector quantization. If we can find the proper way to update the interconnection
weights and the proper function for the processing elements, we should be able to find
a configuration that is capable of duplicating the results of the LBG algorithm which
we saw in the previous chapter. In the following sections we will briefly discuss the
difference between unsupervised and supervised learning, and how neural networks
can be applied to the problem of vector quantization.

B.      NEURAL  NETWORK  LEARNING 

Each processing element of the neural network is connected to many inputs
$\mathbf{x} = (x_0, x_1, \ldots, x_{n-1})$ (see Figure 3.1). These inputs could originate
directly from the input to the network, or some or all could arrive from the output of
another processing element. Each input to the processing element has an associated weight
$w_i$, which describes the strength of the connection between the associated input node
and this processing element. Each processing element has an activation level which
is a function of the inputs and weights. One of the most common activation formulas
is

$$ y = f\left(\sum_{i=0}^{n-1} w_i x_i - \theta\right) \tag{3.1} $$

where $\theta$ is some threshold. This is just a weighted sum which is thresholded and
subjected to a function $f$, which is usually nonlinear.
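A single processing element following Equation 3.1 can be sketched as below. The hard limiter used as the default nonlinearity is only one possible choice of f and is an assumption made here for illustration, as are the function names.

```python
def hard_limiter(a):
    """Example nonlinearity: output 1 for a non-negative argument, else 0."""
    return 1.0 if a >= 0 else 0.0


def activation(inputs, weights, theta, f=hard_limiter):
    """Weighted sum of the inputs, minus the threshold, passed through f."""
    net = sum(w * x for w, x in zip(weights, inputs)) - theta
    return f(net)
```

Passing a different function as `f`, for example a sigmoid, changes the behavior of the element without changing the weighted-sum structure.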

Typical  neural  networks  are  made  up  of  many  of  these  processing  elements 
which  are  arranged  and  interconnected  in  some  pattern.  This  pattern,  the  activation 
formula  discussed  above,  and  the  scheme  for  adaptively  updating  the  weights  for  each 
processing  element  are  the  items  which  characterize  each  type  of  neural  network. 

A  final  property  which  characterizes  a  neural  network  is  the  manner  in  which 
it  is  trained.  There  are  two  main  categories,  namely  supervised  and  unsupervised. 

1.      SUPERVISED  LEARNING

In  supervised  learning,  the  neural  net  is  provided  a  set  of  desired  output 
values  for  each  set  of  input  values  presented  to  the  network.  These  desired  output 
values  are  used  in  order  to  update  the  interconnection  weights.  A  good  example 
of  a  neural  network  which  uses  supervised  learning  is  the  backpropagation  network 
[Ref. 9], which is arranged as in Figure 3.2. The activation function for a typical
implementation of the backpropagation algorithm is

$$ f(\alpha) = \frac{1}{1 + e^{-(\alpha - \theta)}} \tag{3.2} $$

which is known as a sigmoid logistic function. The training of the network proceeds
as follows:

•  Step  1  Initialize  the  interconnection  weights  to  small  random  values. 

•  Step  2  Present  a  set  of  input  values  and  corresponding  desired  output  values 
to  the  network. 

•  Step   3   Apply  the  activation  formula  for  each  processing  element  until  the 
output  values  have  been  calculated. 

•  Step 4 Update the interconnection weights starting with the output layer and
moving downwards using the formula

$$ w_{ij}(t+1) = w_{ij}(t) + \eta\,\delta_j x_i \tag{3.3} $$

where $w_{ij}$ is the interconnection weight between node $i$ of the previous layer and
node $j$ of the current layer, $x_i$ is the activation level of node $i$, and $\eta$ is
the learning rate. The backpropagated error is

$$ \delta_j = \begin{cases} y_j (1 - y_j)(d_j - y_j) & \text{for an output node} \\ x_j (1 - x_j) \sum_k \delta_k w_{jk} & \text{for an intermediate node} \end{cases} \tag{3.4} $$

where $y_j$ is the activation level of node $j$ on the current layer, and $d_j$ is the
desired output for node $j$.

Steps  2-4  are  repeated  until  the  network  weights  have  converged  or  the 
error  between  the  output  and  desired  signals  is  sufficiently  low.  This  method  works 
well  for  a  case  such  as  speech  recognition  where  we  can  collect  a  large  quantity  of 
sample  data  with  the  correct  classification  appended  to  allow  training  of  the  network. 
However, a network of this sort is of little use for a problem like vector quantization,
in which the neural network must form the desired categories without any external
guidance.
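One pass through Steps 2 to 4 can be sketched for a network with a single intermediate layer. This is a minimal illustration under stated assumptions: the thresholds are held fixed at zero rather than adapted, both error terms are computed before any weight is changed, and weights are stored as `w[i][j]` from node i of one layer to node j of the next. All names are hypothetical.

```python
import math


def sigmoid(a, theta=0.0):
    """Sigmoid logistic function with threshold theta."""
    return 1.0 / (1.0 + math.exp(-(a - theta)))


def forward(x, w_hidden, w_out):
    """Step 3: compute the intermediate and output activation levels."""
    hidden = [sigmoid(sum(w_hidden[i][j] * x[i] for i in range(len(x))))
              for j in range(len(w_hidden[0]))]
    output = [sigmoid(sum(w_out[j][k] * hidden[j] for j in range(len(hidden))))
              for k in range(len(w_out[0]))]
    return hidden, output


def train_step(x, d, w_hidden, w_out, eta=0.5):
    """Step 4: update both weight layers in place for one training pair."""
    hidden, y = forward(x, w_hidden, w_out)
    # Error term for each output node.
    delta_out = [y[k] * (1 - y[k]) * (d[k] - y[k]) for k in range(len(y))]
    # Error term for each intermediate node, backpropagated from the outputs.
    delta_hid = [hidden[j] * (1 - hidden[j])
                 * sum(delta_out[k] * w_out[j][k] for k in range(len(y)))
                 for j in range(len(hidden))]
    for j in range(len(hidden)):
        for k in range(len(y)):
            w_out[j][k] += eta * delta_out[k] * hidden[j]
    for i in range(len(x)):
        for j in range(len(hidden)):
            w_hidden[i][j] += eta * delta_hid[j] * x[i]
    return y  # output computed before the update
```

Calling `train_step` repeatedly over the training pairs corresponds to repeating Steps 2 to 4 until the weights settle.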

2.      UNSUPERVISED  LEARNING 

A  good  example  of  a  neural  network  algorithm  that  utilizes  unsupervised 
learning  is  the  competitive  learning  network  shown  in  Figure  3.3.  This  algorithm  is 
designed to take the set of input vectors and use them to form a set of categories,
one category for each node on the second level of the network. This is accomplished
by  measuring  the  proximity  of  each  input  vector  to  the  set  of  weights  for  each  node 
on  the  second  level  and  adaptively  adjusting  the  weights  of  the  closest  node  towards 
the  input  vector.  After  sufficient  training,  the  network  should  categorize  all  input 
vectors  which  are  similar  into  the  same  category  based  on  their  distance  from  the 
weight  vector  of  each  node. 

The  training  of  the  competitive  learning  algorithm  proceeds  as  follows. 

•  Step  1  Initialize  the  weights  from  the  N  input  nodes  to  the  M  output  nodes 
with  small  random  numbers. 

•  Step  2  Present  an  input  vector  from  the  data  set. 

•  Step 3 Compute the distance $d_j$ between the input vector and the weights of
each output node $j$ using the formula

$$ d_j = \sum_{i=0}^{N-1} \left[x_i(t) - w_{ij}(t)\right]^2 \tag{3.5} $$

where $x_i(t)$ is the input to node $i$ at time $t$, and $w_{ij}(t)$ is the interconnection
weight from input node $i$ to output node $j$ at time $t$. Note that this distance
measure is just the unnormalized MSE between the input vector and the weight
vector of output node $j$.

•  Step 4 Select the output node $j^*$ which is closest to the input vector.

•  Step 5 Update the weights of the closest output node $j^*$ using the expression

$$ w_{ij^*}(t+1) = w_{ij^*}(t) + \eta(t)\left(x_i(t) - w_{ij^*}(t)\right) \tag{3.6} $$

where $\eta(t)$ is the time-dependent learning rate.

•  Step  6  Get  the  next  input  vector  and  return  to  step  2. 

We  continue  training  the  network  until  convergence  is  obtained  or  the  average  error 
for  the  entire  data  set  is  less  than  some  threshold  value. 
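The training loop described above can be sketched as follows. This is a minimal illustration under stated assumptions: the learning rate is held constant instead of varying with time, the initial weights are supplied by the caller rather than drawn at random, and training stops after a fixed number of passes over the data. The function name is hypothetical.

```python
def competitive_learning(vectors, initial_weights, eta=0.1, epochs=10):
    """Move the weight vector of the closest output node toward each input."""
    weights = [list(w) for w in initial_weights]  # one weight vector per output node
    for _ in range(epochs):
        for x in vectors:
            # Squared distance from the input to every output node.
            dists = [sum((xi - wi) ** 2 for xi, wi in zip(x, w)) for w in weights]
            winner = dists.index(min(dists))  # closest output node
            # Only the winning node is updated.
            weights[winner] = [wi + eta * (xi - wi)
                               for xi, wi in zip(x, weights[winner])]
    return weights
```

Note that a node which never wins is never updated, so the result is sensitive to where the initial weights lie relative to the data.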

It is not hard to see the resemblance between vector quantization and the
task performed by the competitive learning algorithm. To implement VQ, we just
present each block of the image as an input vector and train the neural network until
it converges. Then the weight vectors produced for each output node are the code
vectors for the VQ codebook, and the indices of the output nodes are the corresponding
codewords. After training, the weights are fixed and the codebook is transmitted
to the receiving site. Then each block to be transmitted is submitted to the neural
network. The index of the closest output node to the input is transmitted as the VQ
codeword. At the receiving site, the codeword is used as the argument in a lookup
table in which the codewords and the corresponding code vectors are stored. The
code vector chosen is then converted into an image block which serves as an
approximation to the original block. After the codewords for all the blocks in the image are
transmitted and decoded, the final reproduction image is assembled from the code
vector approximations.

The  competitive  learning  algorithm  is  now  applied  to  the  two  dimensional 
VQ  example  presented  in  the  previous  chapter.  The  trajectories  of  the  code  vectors 
are  presented  in  Figure  3.4.  Notice  that  the  algorithm  attempts  to  represent  the 
data  with  a  single  code  vector.  This  occurs  because  the  code  vector  that  is  closest 
for  the  first  input  vector  continues  to  be  the  closest  for  all  subsequent  input  vectors. 
Thus  none  of  the  other  code  vectors  are  ever  utilized  and  their  weights  are  never 
updated.  An  algorithm  such  as  this  clearly  does  not  utilize  all  its  code  vectors  and 
thus  cannot  produce  an  optimum  vector  quantizer.  In  the  next  section  we  examine 
modifications  to  the  competitive  learning  algorithm  which  improve  its  performance 
as  a  vector  quantizer. 

C.     FREQUENCY  SENSITIVE  COMPETITIVE  LEARNING 

As  shown  in  the  previous  section,  the  principal  problem  with  using  competitive 
learning  as  a  vector  quantizer  is  the  under-utilization  of  the  output  nodes.  This 
problem  has  been  addressed  in  the  literature  and  several  possible  solutions  have  been 
presented.  In  [Ref.  10],  an  algorithm  referred  to  as  the  Self  Organizing  Map  (SOM) 
is  introduced.  In  the  SOM,  a  neighborhood  is  defined  about  the  closest  output  node 
and  this  neighborhood  is  used  to  update  more  than  one  output  node  at  a  time.  In 
this  technique  the  update  formula  becomes 

Figure 3.4 Competitive Learning 2-D Example

$$ w_{ij}(t+1) = w_{ij}(t) + \eta(t)\left[x_i(t) - w_{ij}(t)\right], \quad j \in \mathcal{N}(j^*, t) \tag{3.7} $$

where $j^*$ is the index of the closest output node and $\mathcal{N}$ is the neighborhood defined
about  the  closest  output  node.  This  neighborhood  is  started  with  a  large  size  to 
encourage  the  updating  of  many  output  nodes,  and  then  gradually  shrunk  with  time 
as  the  network  converges  to  generate  more  fine  structure.  Finally,  the  neighborhood 
shrinks  to  a  single  node  which  allows  each  node  to  be  updated  independently.  At  this 
point  the  SOM  algorithm  is  identical  to  the  original  competitive  learning.  We  can 
see  that  the  improvement  in  output  node  utilization  comes  from  establishing  a  good 
distribution  of  weight  vectors  throughout  the  input  vector  space  and  then  allowing 
the  network  to  converge.  The  drawback  of  this  technique  is  that  the  resulting  network 
takes  an  excessive  number  of  iterations  to  reach  convergence. 
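A single SOM update can be sketched as below. This is a minimal illustration that assumes the output nodes are arranged along a line, so that the neighborhood of the closest node consists of all nodes whose index lies within `radius` of it; the schedule for shrinking `radius` and `eta` over time is left to the caller. The function name is hypothetical.

```python
def som_step(x, weights, eta, radius):
    """Update, in place, every output node in the neighborhood of the closest one.

    Returns the index of the closest output node.
    """
    dists = [sum((xi - wi) ** 2 for xi, wi in zip(x, w)) for w in weights]
    closest = dists.index(min(dists))
    for j in range(len(weights)):
        if abs(j - closest) <= radius:  # node j lies inside the neighborhood
            weights[j] = [wi + eta * (xi - wi) for xi, wi in zip(x, weights[j])]
    return closest
```

With `radius` set to zero only the closest node moves, which is the original competitive learning update.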

Another technique, termed adding a conscience to competitive learning, is presented
in [Ref. 11]. In this algorithm we generate a new variable, $p_j$, for each output
node, which represents the fraction of the time that a particular node is the closest
to the input vector. This variable is initialized to zero and updated by:

$$p_j(t+1) = \begin{cases} (1-B)\,p_j(t) + B & \text{for } j = j^* \\ (1-B)\,p_j(t) & \text{for } j \neq j^* \end{cases} \qquad (3.8)$$

where $B$ is a constant which is chosen small enough to prevent random fluctuations in
the input data from having too large an effect on $p_j$. Then a bias term, $b_j$, is calculated
using $b_j = C(1/M - p_j)$, where $C$ is termed the bias constant. This bias term is then
applied to the distance measure for each output node, and the closest node is chosen
based on this biased distance, $d_j - b_j$. The result of these modifications is to penalize
the output nodes that have won the competition frequently. This produces a very
uniform output node utilization. This algorithm has the advantage of converging
quickly while maintaining good output node utilization, but requires twice as many
distance calculations as the original competitive learning algorithm.
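A single conscience competition can be sketched as below. Two points are assumptions made for the sketch rather than statements from [Ref. 11]: $M$ is taken to be the number of output nodes, and the win frequencies are updated before the bias is computed. The function name `conscience_step` is hypothetical.

```python
def conscience_step(distances, p, B, C):
    """Run one conscience competition.

    distances: unbiased distance d_j for each of the M output nodes
    p:         win-frequency estimates p_j, updated in place by (3.8)
    B, C:      the constant B of (3.8) and the bias constant C
    Returns (node chosen on the biased distance, node j* closest on the
    unbiased distance).
    """
    M = len(distances)  # assumption: M is the number of output nodes
    closest = min(range(M), key=lambda j: distances[j])
    # Equation (3.8): only the closest node j* receives the +B term.
    for j in range(M):
        p[j] = (1 - B) * p[j] + (B if j == closest else 0.0)
    # Bias b_j = C (1/M - p_j) is negative for nodes that win too often.
    bias = [C * (1.0 / M - p[j]) for j in range(M)]
    chosen = min(range(M), key=lambda j: distances[j] - bias[j])
    return chosen, closest


p = [0.9, 0.1]
chosen, closest = conscience_step([1.0, 1.2], p, B=0.1, C=1.0)
```

In this example node 0 is slightly closer, but its high win frequency gives it a negative bias, so the biased competition selects node 1 instead.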

A variation on the conscience technique discussed above is Frequency Sensitive
Competitive Learning (FSCL) [Ref. 12]. In this algorithm the distance, $d_i$, between
the input vector and the weight vector of output node $i$ is modified by:

$$d_i^* = d_i\, g(u_i) \qquad (3.9)$$

where $u_i$ is the number of times output node $i$ has won the competition and $g$ is
termed the fairness function, with $g(u_i) = u_i$ in most cases. The effect of this modification
is to increase the modified distance for those nodes which win frequently.
Over many training iterations, the result is a remarkably even node utilization. This
algorithm preserves the fast convergence of the conscience method and also requires
us to update only one set of weights for each input vector. In addition, the algorithm
requires only one set of distance calculations and is thus much faster than the conscience
method. The FSCL vector quantizer is the basic building block which will be
used in the algorithms in the next chapter.
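A minimal FSCL training loop, using the modified distance of (3.9) with $g(u_i) = u_i$, might look like the following. Several details are assumptions of this sketch and not of [Ref. 12]: the win counts start at one (so that the modified distances are not all zero on the first presentation), the learning rate `eta` is constant, and the distance is squared Euclidean. The name `fscl_train` is hypothetical.

```python
def fscl_train(data, codebook, eta, epochs):
    """Train `codebook` in place with FSCL and return the win counts u_i."""
    wins = [1] * len(codebook)  # assumption: counts start at 1, not 0
    for _ in range(epochs):
        for x in data:
            # Modified distance d_i* = d_i * g(u_i) with g(u) = u.
            scores = [
                wins[i] * sum((a - c) ** 2 for a, c in zip(x, code))
                for i, code in enumerate(codebook)
            ]
            i = scores.index(min(scores))
            # Only the winning weight vector is moved toward the input.
            codebook[i] = [c + eta * (a - c) for c, a in zip(codebook[i], x)]
            wins[i] += 1
    return wins


data = [[0.0], [0.1], [10.0], [10.1]]
codebook = [[0.0], [0.05]]
wins = fscl_train(data, codebook, eta=0.5, epochs=5)
```

Only one pass over the distances is needed per input vector, which is the source of the speed advantage over the conscience method noted above.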

We  first  apply  the  FSCL  vector  quantizer  to  the  same  2-D  example  for  which 
the  competitive  learning  algorithm  failed.  The  trajectories  of  the  code  vectors  are 
shown  in  Figure  3.5.  The  FSCL  clearly  solves  the  problem  of  node  utilization  and 
produces  the  same  result  as  the  LBG  algorithm. 

The FSCL has been applied to the vector quantization of images [Ref. 13]
and some interesting results have emerged. Figure 3.6 shows the number of training
iterations required by the FSCL and LBG algorithms. For a small codebook, the
FSCL has a sizable computational advantage, while for larger codebooks the LBG
algorithm is more efficient. To get an idea of how codebook size affects reproduction
quality, we have applied the FSCL algorithm using various codebook sizes. The
original image is 256 x 256 pixels (see Figure 3.7) and was divided into blocks of
various sizes to produce a data rate of 1 bit/pixel for each example. Figure 3.8 shows
an image produced with a block size of 2 x 2 and a codebook size of 16. Figure 3.9
shows an image produced using a block size of 3 x 2 and a codebook size of 64.
Figure 3.10 shows an image produced using a 3 x 3 block and a codebook size of
512. We can clearly see that the larger codebooks produce a much better quality
of reproduction at the same data rate. This leaves us with the question of how to
get the good reproduction quality of large codebooks while also taking advantage of
the computational efficiency of the FSCL algorithm for generating small codebooks.
The next chapter demonstrates several techniques that can be applied to the FSCL
algorithm which allow us to form large codebooks without the excessive amount of
training required by the original algorithm.

Figure 3.6 Training Required For LBG and FSCL Algorithms

Figure 3.7 Original Image

Figure 3.8 FSCL Using a Size 16 Codebook and a 2x2 Block

Figure 3.9 FSCL Using a Size 64 Codebook and a 3x2 Block

Figure 3.10 FSCL Using a Size 512 Codebook and a 3x3 Block

IV.  ALGORITHM  DEVELOPMENT

A.     INTRODUCTION 

The  previous  two  chapters  provided  an  overview  of  vector  quantization,  and 
how neural networks have been applied to this problem. Here we investigate the limitations
of existing algorithms, and propose modifications which substantially reduce
the  computational  requirements  without  significant  loss  in  performance. 

As  we  saw  in  Figures  3.8-3.10,  the  reproduction  quality  of  a  vector  quantizer 
depends  strongly  on  the  dimensionality  of  the  vector  utilized.  We  wish  to  use  the 
maximum  dimensionality  possible,  but  we  are  limited  by  the  fact  that  the  codebook 
size grows exponentially with increasing vector dimension when the data rate is held fixed.
Whether we plan to implement the neural network by simulation or in hardware, this limitation introduces
significant  difficulties. 
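The exponential growth can be made concrete with a little arithmetic: a quantizer that sends one codeword index per block spends log2(codebook size) bits on a block, so holding the rate at 1 bit/pixel forces the codebook size to double with every added pixel. The helper names below are hypothetical and serve only to illustrate the calculation.

```python
import math


def bits_per_pixel(codebook_size, block_rows, block_cols):
    """Data rate of a vector quantizer that sends one index per block."""
    return math.log2(codebook_size) / (block_rows * block_cols)


def codebook_size_for_rate(rate, block_rows, block_cols):
    """Codebook size needed to hold `rate` bits/pixel at a given block size."""
    return 2 ** round(rate * block_rows * block_cols)


# Codebook size required at 1 bit/pixel for a few block sizes.
for rows, cols in [(2, 2), (3, 2), (3, 3), (4, 3), (4, 4)]:
    size = codebook_size_for_rate(1.0, rows, cols)
    print(f"{rows} x {cols} block: {size} code vectors")
```

The first three block sizes give the codebook sizes 16, 64, and 512 used in the examples of the previous chapter.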

In  the  case  of  simulation,  the  large  capacity  memory  chips  available  today  allow 
us  to  implement  very  large  codebooks.  However,  we  can  see  from  Figure  3.6  that 
for  very  large  codebooks,  the  neural  network  algorithm  has  a  substantially  higher 
computational  cost  than  the  Linde,  Buzo,  and  Gray  (LBG)  algorithm.  So  in  order  to 
make  the  neural  network  simulation  useful,  we  must  limit  ourselves  to  small  codebooks 
and  thus  poor  performance,  or  find  a  way  to  form  a  codebook  with  a  large  effective 
size  by  combining  many  smaller  codebooks. 

In  the  case  of  hardware  implementation,  the  computational  disadvantage  of  the 
neural  network  for  large  codebook  size  is  substantially  mitigated  by  the  advantage 
gained  from  parallel  processing.  However  in  this  case,  the  codebook  size  is  now  limited 
by the number of processing elements which can be implemented in hardware. Even
with expected advances in neural network hardware, it is still important to maximize
the  effective  codebook  size  for  a  given  number  of  processing  elements. 

In the following sections we investigate algorithms which improve the performance
for both hardware and simulation implementations. These algorithms allow a
large codebook to be formed from many small codebooks, and allow a large effective
codebook to be formed using a substantially smaller number of processing elements.

The  vector  quantizers  we  have  examined  so  far  are  optimal  in  two  senses.  First 
the  codebook  formed  produces  the  minimum  MSE  possible  for  the  training  data 
utilized,  and  secondly  the  encoder  always  picks  the  codeword  corresponding  to  the 
vector  which  produces  the  least  distortion  for  any  given  input  vector.  This  type  of 
algorithm  is  called  full  search  vector  quantization  (FSVQ),  and  it  must  calculate  a 
number  of  distortions  equal  to  the  size  of  the  codebook  for  each  vector  processed.  As 
noted  above,  this  property  makes  full  search  codes  impractical  except  for  the  case  of 
small  codebooks. 

We  now  consider  algorithms  that  produce  codes  which  are  suboptimal  in  both 
senses mentioned above. They may not yield a codebook with the
minimum MSE for the training data, and they may not select the codeword corresponding
to the smallest distortion available. However, these algorithms produce
codebooks whose structure dramatically reduces the computational effort
required for a given codebook size. Although the performance is degraded relative to
a full search algorithm, the suboptimal algorithm can offer such a large reduction in
complexity that a larger codebook may be implemented. This in turn can provide
better performance at a smaller computational cost than the full search algorithm.
These algorithms are described below.

B.  TREE  SEARCHED  VECTOR  QUANTIZATION

The  TSVQ  [Ref.  14]  design  was  developed  in  an  attempt  to  reduce  the  number 
of distance calculations which must be made to encode a vector. In the neural network
implementation, the software simulation benefits from this
reduction in distance calculations, and we also see a reduction in the amount of
training required. This second improvement stems from the fact that the structure of the
TSVQ produces data subsets for which the basic FSCL vector quantizer
converges more quickly.

The  TSVQ  algorithm  is  a  structure  which  causes  us  to  search  a  sequence  of 
smaller  codebooks  rather  than  a  single  large  one.  This  is  accomplished  by  arranging 
many  small  vector  quantizers  in  a  tree  structure  as  shown  in  Figure  4.1.  The  tree 
is  searched  starting  with  the  root,  and  each  search  of  the  smaller  vector  quantizers 
advances one level through the tree. An $m$ level TSVQ is characterized by the $m$-tuple
$\mathbf{R} = (R_1, R_2, \ldots, R_m)$, which describes the number of bits encoded at each level of
the tree. Each vector quantizer at level $j$ has $2^{R_j}$ codewords, and $2^{\sum_{i=1}^{j-1} R_i}$
vector quantizers are required to complete level $j$. The codebook size for the entire
structure is $2^{\sum_{i=1}^{m} R_i}$.

The encoding of a vector proceeds by first applying the input vector, x, to
the vector quantizer at the root of the tree structure. This produces the closest
code vector, $y_1$, which is our first estimate of x, and the first $R_1$ bits of the channel
codeword. This $R_1$-tuple, $u_1$, also serves as the index of the vector quantizer to be
searched in the next level. Thus each codeword in level one provides a mapping to
a vector quantizer in level two. We then present x to the size $2^{R_2}$ vector quantizer
selected at level two, which produces a new estimate, $y_2$, and the second portion of the
channel codeword, $u_2$. We use the vector $(u_1, u_2)$ to choose the vector quantizer to
search at the third level. This process continues until the final level is reached. At this
point, we have produced our final estimate, $y_m$, and the complete channel codeword,
$u = (u_1, u_2, \ldots, u_m)$. This structure allows us to encode a vector using only $\sum_{i=1}^{m} 2^{R_i}$
distance calculations, in contrast with the $2^R$ calculations required for the full search
method ($R = R_1 + R_2 + \cdots + R_m$). Table 4.1 shows some examples of how large the
computational savings for the encoding step can be for TSVQ. The $\mathbf{R}$ vector listed
in the table describes the architecture of the particular TSVQ structure used in the
example, using the notation introduced earlier in this section.
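The encoding walk through the tree can be sketched as follows. The data layout is an assumption made for illustration: the tree is a dictionary that maps the path taken so far (a tuple of the indices chosen at earlier levels) to the small codebook searched next. The names `tsvq_encode` and `nearest` are hypothetical.

```python
def nearest(codebook, x):
    """Index of the code vector closest to x (squared Euclidean distance)."""
    return min(range(len(codebook)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(x, codebook[i])))


def tsvq_encode(tree, x, levels):
    """Encode x by searching one small codebook per level.

    Returns (path, estimate, calcs): the channel codeword as a tuple of
    per-level indices, the final code vector, and the number of distance
    calculations performed.
    """
    path = ()
    estimate = None
    calcs = 0
    for _ in range(levels):
        codebook = tree[path]      # quantizer selected by the earlier indices
        calcs += len(codebook)     # one distance per code vector searched
        i = nearest(codebook, x)
        estimate = codebook[i]
        path += (i,)
    return path, estimate, calcs


# A two level tree with R = (1, 1) on one-dimensional data.
tree = {
    (): [[2.0], [8.0]],
    (0,): [[1.0], [3.0]],
    (1,): [[7.0], [9.0]],
}
path, estimate, calcs = tsvq_encode(tree, [6.8], levels=2)
```

Each level costs only as many distance calculations as the small codebook it searches, which is where the sum over levels in the text comes from.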

TABLE 4.1: Number of Encoding Distance Calculations Required

                                           Distance Calculations
R          Block Size    Codebook Size     FSVQ      TSVQ
(2,2)      2 x 2         16                16        8
(3,3)      3 x 2         64                64        16
(3,3,3)    3 x 3         512               512       24

The  training  of  the  TSVQ  proceeds  one  level  at  a  time.  We  first  apply  the  entire 
training  set  to  the  FSCL  vector  quantizer  at  the  root  of  the  tree  until  convergence 
is obtained. We then use the codebook produced to divide up the data set into
$2^{R_1}$ subsets based on their proximity to the newly generated code vectors. The new
subsets are then applied to the $2^{R_1}$ vector quantizers on level two. We proceed in this
way  until  the  vector  quantizers  at  the  final  level,  m,  have  been  trained.  Each  vector 
quantizer  in  the  tree  is  initialized  by  randomly  selecting  vectors  from  the  appropriate 
training  set.  This  type  of  initialization  speeds  convergence  of  the  neural  networks. 
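The level-by-level training procedure can be sketched as below. The trainer is passed in as a function so that the sketch stays independent of any particular quantizer; `pick_spread` is a hypothetical stand-in that simply picks evenly spaced sorted samples and is NOT the FSCL algorithm. Initialization by random selection from the subset, as described above, would live inside the real trainer. Skipping branches that receive no training vectors is also an assumption of this sketch.

```python
def nearest(codebook, x):
    """Index of the code vector closest to x (squared Euclidean distance)."""
    return min(range(len(codebook)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(x, codebook[i])))


def pick_spread(data, size):
    """Hypothetical stand-in for FSCL training: evenly spaced sorted samples."""
    ordered = sorted(data)
    return [list(ordered[k * len(ordered) // size]) for k in range(size)]


def train_tsvq(data, bits, train_vq):
    """Train a TSVQ one level at a time.

    Returns a dict mapping each path (tuple of indices chosen so far) to the
    codebook searched at that point in the tree.
    """
    tree = {}
    subsets = {(): list(data)}       # the root sees the whole training set
    for r in bits:
        next_subsets = {}
        for path, subset in subsets.items():
            codebook = train_vq(subset, 2 ** r)
            tree[path] = codebook
            # Split this subset by proximity to the new code vectors.
            for x in subset:
                child = path + (nearest(codebook, x),)
                next_subsets.setdefault(child, []).append(x)
        subsets = next_subsets       # branches with no data are skipped
    return tree


training = [[0.0], [1.0], [10.0], [11.0]]
tree = train_tsvq(training, bits=(1, 1), train_vq=pick_spread)
```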

This  structure  allows  us  to  greatly  reduce  the  number  of  distance  calculations 
necessary  for  the  software  simulation  case.  This  is  true  because  the  path  through  the 
tree  allows  us  to  ignore  the  vast  majority  of  code  vectors  which  are  far  from  the  input 
vector.  TSVQ  also  displays  a  property  which  is  termed  graceful  degradation.  This 
means that if the codeword must be truncated due to channel capacity considerations,
it will still be possible to send a good estimate of the data at this new lower data rate.
This  is  in  contrast  to  the  full  search  method,  whose  codeword  conveys  no  useful 
information  if  it  is  truncated.  An  added  benefit  of  this  method  is  that  the  structure 
imposed  on  each  of  the  data  subsets  applied  to  the  FSCL  vector  quantizers  causes 
them  to  converge  more  quickly.  This  provides  a  substantial  reduction  in  training 
required  for  both  software  and  hardware  implementations. 

A  final  benefit  of  the  method  is  the  large  reduction  in  the  number  of  processing 
elements  required  for  hardware  implementation.  Since  the  TSVQ  algorithm  updates 
only  the  weights  of  the  vector  quantizers  of  the  path  taken  for  each  input  vector 
applied,  these  are  the  only  vector  quantizers  that  must  be  realized  in  hardware.  Thus 
we  can  convert  the  hardware  implementation  from  a  tree  structure  to  a  linear  structure 
(see  Figure  4.2)  along  with  memory  and  a  system  to  load  the  appropriate  weights  for 
each level. Thus we can reduce the number of processing elements required from
$\sum_{i=1}^{m} \prod_{j=1}^{i} 2^{R_j}$ to $\sum_{i=1}^{m} 2^{R_i}$. Table 4.2 shows some examples of the number of processing
elements required if the TSVQ is implemented in hardware using the tree structure and
the linear structure. For larger block sizes and codebook sizes, the savings are substantial.

TABLE 4.2: Number of Processing Elements Required

                                           PE's Required
R          Block Size    Codebook Size     Tree Structure    Linear Structure
(2,2)      2 x 2         16                20                8
(3,3)      3 x 2         64                72                16
(3,3,3)    3 x 3         512               584               24
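The two processing element counts follow directly from the bit allocation. The sketch below evaluates both expressions, assuming one processing element per code vector; the function names are hypothetical.

```python
def tree_elements(bits):
    """Processing elements when every quantizer in the tree is realized."""
    total, width = 0, 1
    for r in bits:
        width *= 2 ** r   # code vectors on this level of the tree
        total += width
    return total


def linear_elements(bits):
    """Processing elements when only one quantizer per level is realized."""
    return sum(2 ** r for r in bits)


for bits in [(2, 2), (3, 3), (3, 3, 3)]:
    print(bits, tree_elements(bits), linear_elements(bits))
```

The linear count grows with the sum of the per-level codebook sizes, while the tree count is dominated by the final level, which alone holds as many code vectors as the full search codebook.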

For the simulations, a single 256 x 256 pixel image was utilized. This image
was divided into blocks of various sizes to achieve a data rate of 1 bit/pixel for each
example. The 1 bit/pixel rate provided a standard to allow comparisons between examples
with different codebook sizes, and provided a challenging enough problem to allow
good comparisons to be made.

Figure 4.1 Tree Search Vector Quantization

Figure 4.2 Linear Hardware Implementation Of TSVQ

The  simulation  results  for  TSVQ  are  shown  in  Figures  4.3-4.6.  The  original 
image  is  shown  in  Figure  4.3.  TSVQ  with  a  block  size  of  2  x  2,  and  a  size  16 
codebook  constructed  from  five  size  4  codebooks  arranged  in  a  two  level  tree,  is  shown 
in  Figure  4.4.  TSVQ  with  a  block  size  of  3  x  2,  and  a  size  64  codebook  constructed 
from  nine  size  8  codebooks  arranged  in  a  two  level  tree,  is  shown  in  Figure  4.5.  TSVQ 
with  a  block  size  of  3  x  3,  and  a  size  512  codebook  constructed  from  73  size  8  codebooks 
arranged  in  a  three  level  tree,  is  shown  in  Figure  4.6.  It  is  easy  to  see  the  strong  effect 
of  codebook  size  on  performance  by  noting  the  improvement  in  subjective  quality 
as the codebook size is increased from 16 to 64 to 512. In particular, the larger
codebook sizes display an image that appears sharper because the small codebooks
do not contain a sufficient number of code vectors to represent edges well. Also, the
small codebook does not contain code vectors with enough different grey scales to
reproduce  gradually  changing  intensities,  such  as  those  in  the  top  of  the  hat  or  near 
the  beam  to  the  left  of  the  hat.  This  is  confirmed  by  the  MSE  performance,  which  is 
displayed  in  Figure  4.7.  Comparing  the  MSE  performance  of  TSVQ  to  the  full  search 
algorithm,  we  can  see  that  the  loss  of  performance  is  very  small.  This  is  reinforced  by 
comparing  Figures  4.4-4.6  for  TSVQ  and  Figures  3.8-3.10  for  full  search,  which  show 
that  the  degradation  caused  by  use  of  the  TSVQ  method  is  small  in  the  subjective 
sense  as  well. 

To give an idea of the refinement taking place at each level, each stage of the
three stage TSVQ example in Figure 4.6 is shown in Figures 4.8-4.10. The improvement
taking place at each level is clear. We can also get a good idea of what would be
reconstructed if the code were truncated. Figure 4.8 corresponds to 0.33 bits/pixel,
Figure 4.9 corresponds to 0.67 bits/pixel, and Figure 4.10 corresponds to 1.0 bit/pixel.
It is apparent that a degraded but nevertheless useful image is still available if the
code is truncated. This is the property previously termed graceful degradation.

Figure 4.3 Original

Figure 4.4 TSVQ Using a Size 16 Codebook and a 2x2 Block

Figure 4.5 TSVQ Using a Size 64 Codebook and a 3x2 Block

Figure 4.6 TSVQ Using a Size 512 Codebook and a 3x3 Block

Figure 4.7 Performance vs. Block Size

Figure 4.8 TSVQ Using 3x3 Block First Stage

Figure 4.9 TSVQ Using 3x3 Block Second Stage

Figure 4.10 TSVQ Using 3x3 Block Third Stage

The  improvement  in  computational  cost  can  be  seen  in  Figure  4.11.  For  the 
three examples, the savings varied from 6S to 72 percent. In other words, the TSVQ
algorithm required only about 1/3 to 1/4 of the computation. As stated before, this
advantage is a result of utilizing smaller, more efficient codebooks to form a single large
effective codebook. This large computational advantage is gained at a very modest
loss  of  performance.  This  makes  the  TSVQ  an  extremely  attractive  alternative  to  the 
FSCL  algorithm. 

C.     MULTI  STAGE  VECTOR  QUANTIZATION 

We  saw  in  the  last  section  that  the  TSVQ  algorithm  offers  many  advantages 
for  neural  network  vector  quantizers,  but  that  some  troublesome  limitations  remain. 
First,  for  both  TSVQ  and  FSVQ,  the  load  on  the  channel  of  transmitting  updates  for 
very  large  codebooks  can  be  excessive.  Second,  even  though  TSVQ  can  reduce  the 
training  effort,  a  large  number  of  passes  through  the  image  is  still  required  for  good 
performance.  Finally,  although  TSVQ  produces  a  codebook  with  structure,  it  actually 
increases  the  storage  required  for  the  code  book.  We  now  examine  the  application 
of  a  technique  termed  Multiple  Stage  Vector  Quantization  (MSVQ)  [Ref.  15]  to  the 
basic  FSCL  vector  quantizer.  This  technique  has  the  advantage  of  further  reducing 
the  computational  cost  and  allowing  a  very  efficient  hardware  implementation. 

Like TSVQ, MSVQ has two or more levels, but instead of working with the
original input vector at each stage as in TSVQ, MSVQ attempts to encode the error
generated at the previous level. An $m$ level MSVQ (see Figure 4.12) can be described
by the $m$-tuple $\mathbf{R} = (R_1, R_2, \ldots, R_m)$, where $R_i$ is the number of bits used to encode
the error at level $i$ of the MSVQ. The first level of the MSVQ is just a normal FSCL
vector quantizer. The input vector, x, is applied to the vector quantizer at level one
and the first estimate, $y_1$, is produced along with the first $R_1$ bits of the channel
codeword, $u_1$. Next, the first error vector is formed by taking the vector difference,
$x - y_1$. This error vector, $e_1$, is then applied to the size $2^{R_2}$ vector quantizer at level
two, which produces an estimate of the error vector, $\hat{e}_1$, and the next $R_2$ bits of the
channel codeword. So at the second level, our estimate of the input vector is the
vector sum $y_2 = y_1 + \hat{e}_1$. In the following stages, we continue to form an error vector
from the previous stage and use an FSCL vector quantizer to encode this error. Each
stage produces an estimate for the error and a portion of the channel codeword. At
the last stage, the error vector $e_{m-1}$ is encoded and the final estimate of the input
vector is available by performing $y_m = y_1 + \hat{e}_1 + \hat{e}_2 + \cdots + \hat{e}_{m-1}$, along with the full channel
codeword $u = (u_1, u_2, \ldots, u_m)$.

Figure 4.12 Multi Stage Vector Quantization

Figure 4.13 Classification Vector Quantization
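The stage-by-stage encoding described above can be sketched as follows. The layout is an assumption for illustration: `codebooks[0]` quantizes the input vector and each later codebook quantizes the error left by the stage before it. The names `msvq_encode` and `nearest` are hypothetical.

```python
def nearest(codebook, x):
    """Index of the code vector closest to x (squared Euclidean distance)."""
    return min(range(len(codebook)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(x, codebook[i])))


def msvq_encode(codebooks, x):
    """Encode x with one small codebook per stage.

    Returns (indices, estimate): the per-stage indices that make up the
    channel codeword, and the running sum of the selected code vectors.
    """
    residual = list(x)              # stage one sees the input itself
    estimate = [0.0] * len(x)
    indices = []
    for codebook in codebooks:
        i = nearest(codebook, residual)
        indices.append(i)
        # The estimate accumulates the quantized input and quantized errors.
        estimate = [e + c for e, c in zip(estimate, codebook[i])]
        # The error passed to the next stage is what remains unexplained.
        residual = [r - c for r, c in zip(residual, codebook[i])]
    return indices, estimate


codebooks = [[[2.0], [8.0]], [[-1.0], [1.0]]]
indices, estimate = msvq_encode(codebooks, [6.5])
```

Stopping the loop early yields the lower rate estimate that would be reconstructed from a truncated codeword.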

For encoding, MSVQ requires $\sum_{i=1}^{m} 2^{R_i}$ distance calculations, which is the same
as TSVQ and much less than the $2^R$ required for FSVQ. However, the MSVQ requires
only $m$ vector quantizers, and thus $m$ small codebooks to be stored, as compared with
$\sum_{i=1}^{m} \prod_{j=1}^{i-1} 2^{R_j}$ for TSVQ. Table 4.3 shows the difference in the number of codebooks
required by TSVQ and MSVQ for some of the examples used in simulations. This reduces
the total number of code vectors to be stored from $\sum_{i=1}^{m} \prod_{j=1}^{i} 2^{R_j}$ for TSVQ and $2^R$
for FSVQ to $\sum_{i=1}^{m} 2^{R_i}$ for MSVQ. Table 4.4 shows the total number of code vectors
which must be stored for several examples of FSVQ, TSVQ, and MSVQ. We can see
that there is a storage price to be paid for the computational advantage of TSVQ,
but the MSVQ provides a large reduction in both. This dramatically reduces both
storage requirements and the load on the channel from transmitting codebook updates.
Table 4.5 shows the extra load on the channel for each of the three algorithms
assuming that the codebook is updated with each frame. The advantage of MSVQ
in this regard for large codebooks is apparent.
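As a worked check of these counts, consider the three level allocation (3, 3, 3) used for the 3 x 3 block examples. The arithmetic below only substitutes into the expressions of this section; it is an illustration and is not copied from Tables 4.3-4.5.

```latex
\begin{align*}
\text{codebooks, TSVQ:} \quad & 1 + 2^{3} + 2^{3} \cdot 2^{3} = 73 \\
\text{codebooks, MSVQ:} \quad & m = 3 \\
\text{code vectors, FSVQ:} \quad & 2^{3+3+3} = 512 \\
\text{code vectors, TSVQ:} \quad & 2^{3} + 2^{6} + 2^{9} = 584 \\
\text{code vectors, MSVQ:} \quad & 2^{3} + 2^{3} + 2^{3} = 24
\end{align*}
```

The TSVQ stores somewhat more than the full search codebook because its final level alone holds 512 code vectors, while the MSVQ stores only the three size 8 codebooks.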

As with TSVQ, the training of the MSVQ proceeds one level at a time. We apply
the original data set to the FSCL vector quantizer at the first level until convergence is
obtained. Then we pass the data through the trained vector quantizer and compute
the error vector between each input vector and the closest code vector. This forms
a new data set which is a collection of the first stage errors. This first stage error
data set is then applied to the vector quantizer on the second level until convergence
is obtained; then it is applied a final time to compute the second stage error vectors.
This continues until the last stage has been trained. Although the data sets
produced by MSVQ do not have the same desirable structure as the data subsets
from TSVQ, there are far fewer codebooks for MSVQ to train. Indeed we find that
the smaller number of codebooks outweighs the longer convergence time in all cases
except for very small overall codebooks. Thus the MSVQ requires significantly fewer
training passes than TSVQ to reach convergence.
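The training procedure amounts to repeatedly training one quantizer and replacing the data set with its errors. In the sketch below the trainer is a parameter; `pick_spread` is a hypothetical stand-in that picks evenly spaced sorted samples and is NOT the FSCL algorithm, and the names `train_msvq` and `nearest` are likewise illustrative.

```python
def nearest(codebook, x):
    """Index of the code vector closest to x (squared Euclidean distance)."""
    return min(range(len(codebook)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(x, codebook[i])))


def pick_spread(data, size):
    """Hypothetical stand-in for FSCL training: evenly spaced sorted samples."""
    ordered = sorted(data)
    return [list(ordered[k * len(ordered) // size]) for k in range(size)]


def train_msvq(data, bits, train_vq):
    """Train one codebook per stage on the errors left by earlier stages.

    Returns (codebooks, residuals), where residuals are the errors that
    remain after the final stage.
    """
    codebooks = []
    residuals = [list(x) for x in data]   # stage one trains on the data itself
    for r in bits:
        codebook = train_vq(residuals, 2 ** r)
        codebooks.append(codebook)
        # Final pass through the trained stage to form the next error set.
        residuals = [
            [a - c for a, c in zip(x, codebook[nearest(codebook, x)])]
            for x in residuals
        ]
    return codebooks, residuals


training = [[0.0], [1.0], [10.0], [11.0]]
codebooks, residuals = train_msvq(training, bits=(1, 1), train_vq=pick_spread)
```

Only one quantizer is trained per level, regardless of how many code vectors the earlier levels contain, which is why the number of codebooks to train is so much smaller than for TSVQ.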

It is useful at this point to examine the differences between MSVQ and TSVQ.
Both methods produce a multi-level process, but the processing at each level is significantly
different. The TSVQ algorithm presents the original data vector at each
level, while the MSVQ presents the residual error at each level. TSVQ has an ever
increasing number of vector quantizers at each level, while MSVQ has a single vector
quantizer at each level. TSVQ provides increasingly accurate estimates of the input
at each level by systematically dividing the higher dimensional vector space into
smaller and smaller subspaces into which the input must fall. MSVQ provides an
initial estimate at the first level, and provides a better estimate at each level by continuing
to add smaller and smaller correction terms in a way similar to the method of
successive approximations. Each of these corrections is a result of performing vector
quantization on the error subspace of the preceding level. In TSVQ, the code vectors
at intermediate levels are not actually utilized for reconstruction; they are only used
as pointers to direct the algorithm to the appropriate vector quantizer at the final
level. Only code vectors of the vector quantizers at the final level are actually used
in the image reconstruction.

The reconstructed images for MSVQ are presented in Figures 4.14-4.17. As in
the previous results, the simulations were conducted on a single test image of 256 x 256
pixels. This image was divided into blocks of various sizes chosen to yield a data rate
of 1 bit/pixel for each reconstruction using a variety of codebook sizes. The original
image is presented in Figure 4.3. MSVQ using a 2 x 2 block and a codebook size
of 16 is presented in Figure 4.14. This codebook was generated using a two level
architecture containing two codebooks of size 4. MSVQ using a 3 x 2 block and a
codebook of size 64 is presented in Figure 4.15. This codebook was generated using a
two level architecture containing two codebooks of size 8. MSVQ using a 3 x 3 block
and a codebook size of 512 is presented in Figure 4.16. This codebook was generated using
a three level architecture containing three codebooks of size 8. MSVQ using a 4 x 3
block and a codebook size of 4096 is presented in Figure 4.17. This codebook was
generated using a three level architecture containing three codebooks of size 16.

We  also  present  one  example  of  how  the  image  develops  through  each  stage  of 
the  MSVQ  process.  Figures  4.18-4.20  show  each  stage  for  the  example  presented  in 
Figure 4.16. As we saw with TSVQ, the improvement in each stage is easy to see.
The  property  of  graceful  degradation  is  also  manifested  by  MSVQ,  since  the  figures 
shown  correspond  to  the  lower  bit  rate  images  that  would  be  produced  if  the  channel 
codewords  were  truncated. 

As with FSVQ and TSVQ, we can see that the performance of MSVQ depends
strongly on the size of the codebook. The performance of MSVQ falls far short of the
standard set by FSVQ, as can be seen in the MSE comparison shown in Figure 4.21.
The reason for this large degree of suboptimality can be seen in the structure of
MSVQ. Consider a TSVQ structure in which we formed the data subsets for the next
level using the error vector instead of the original input vector. Since the vector
quantization process is translation invariant, the performance of this new structure
would be identical to the original TSVQ. We can also see that this structure is the
same as MSVQ except that a different codebook is used to encode the error vectors
for each branch of the tree. Thus MSVQ is equivalent to TSVQ if we assume that the
probability distribution function which describes the distribution of the errors about
each code vector on the same level of the tree is identical. That this assumption is far
from the truth accounts for the relatively poor performance of the MSVQ algorithm.

Figure 4.14 MSVQ Using a Size 16 Codebook and a 2x2 Block

Figure 4.15 MSVQ Using a Size 64 Codebook and a 3x2 Block

Figure 4.16 MSVQ Using a Size 512 Codebook and a 3x3 Block

Figure 4.17 MSVQ Using a Size 4096 Codebook and a 4x3 Block

Figure 4.18 MSVQ Using 3x3 Block First Stage

Figure 4.19 MSVQ Using 3x3 Block Second Stage

Figure 4.20 MSVQ Using 3x3 Block Third Stage

Figure 4.21 Performance vs. Block Size

Although the performance of MSVQ is poor relative to FSVQ and TSVQ for
codebooks of the same size, MSVQ maintains several highly desirable features. We can
see from Figure 4.22 that MSVQ provides a huge computational advantage for large
codebooks. MSVQ also provides an extremely simple structure which would require
only a small number of processing elements and would make hardware implementation
much simpler. Finally, because MSVQ uses only one vector quantizer per level, the
algorithm vastly reduces the amount of storage required for simulation and decreases
the load on the transmission channel due to codebook transmission.

D.      CLASSIFICATION  VECTOR  QUANTIZATION 

The refinements to the basic FSCL algorithm that we have examined so far concentrate
on reducing the computational cost of training the vector quantizer system.
Our standard for performance in all cases has been the mean square error. Now we
take a brief look at the subjective quality of the images produced. The most noticeable
problem with each of the methods is the staircase effect. This is where an edge
follows the outline of the blocks rather than the smooth edge of the original image,
as can be seen by examining the curve in the shoulder in Figures 4.3 and 4.5. This
staircase effect follows the size of the block used in coding the image, and will thus
become more and more noticeable as the block size is increased. This puts us in the
uncomfortable situation of wanting to increase the block size to improve mean square
error performance and at the same time wanting to reduce the block size to minimize
this staircase effect. In order to solve this dilemma we need to examine the cause of
this staircase effect and look at possible solutions.

One possible cause is that the codebook does not contain a sufficient variety
of code vectors which represent blocks with edges. To examine this possibility, a
codebook for an FSVQ with a 2 x 2 block size is presented in Table 4.6. The four pixel
values in each row constitute a code vector. We would expect a code vector which
represents an edge to contain both high and low values, but upon examining the
codebook in Table 4.6, we see that the code vectors exhibit almost no structure and
are certainly inadequate to represent all the possible edge configurations. To examine
the reason for this under-representation of edge blocks, we introduce an edge detector
which is used to indicate whether an edge appears somewhere in the block.

For each pair of adjacent pixels in the block, we take the pixel values $m_1$ and
$m_2$, form the ratio

$$\frac{|m_1 - m_2|}{\max(m_1, m_2)},$$

and apply a threshold to determine whether this is an edge block or a shade block.
The results of applying this ratio to our test image are presented in Figure 4.23 for
a block size of 2 x 2. The authors of [Ref. 16] chose a threshold of 0.4 to define an
edge block. Applying this value gives us only 202 edge blocks out of a total of 16384
blocks in the image. Thus it appears that the problem with edges occurs because
there are so few edge blocks in the image that the poor representation of these
blocks does not contribute significantly to the mean square error. So the root of the
problem seems to be that the distortion measure, i.e., mean square error, fails to take
into account the perceptual importance of the edge blocks. This leaves two basic
solutions: change to a more complicated, perceptually based distortion measure, or
divide the problem by using separate vector quantizers on the edge and shade blocks.

Figure 4.23  Histogram of Edge Detector Ratio Values
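The edge detector described above can be sketched in a few lines of Python (not the MATLAB used in the appendix; the function names are hypothetical). As an assumption, the sketch compares every pair of pixels in the block, which is what the classification listing in Appendix A does for a 2 x 2 block, and it assumes non-negative pixel values. It is applied to the three code vectors of Table 4.6 that show the most variation, and to one made-up block with a sharp transition.

```python
from itertools import combinations


def edge_ratio(block):
    """Largest |m1 - m2| / max(m1, m2) over all pixel pairs in the block."""
    worst = 0.0
    for m1, m2 in combinations(block, 2):
        top = max(m1, m2)
        if top > 0:                       # two zero pixels carry no contrast
            worst = max(worst, abs(m1 - m2) / top)
    return worst


def is_edge_block(block, threshold=0.4):
    """Classify a block as edge (True) or shade (False)."""
    return edge_ratio(block) > threshold


# Code vectors 1, 6 and 16 of Table 4.6.
table_4_6_rows = [(159, 167, 186, 193), (133, 126, 107, 104), (204, 199, 177, 166)]
for row in table_4_6_rows:
    print(row, edge_ratio(row), is_edge_block(row))

# A hypothetical block with a sharp transition, for contrast.
print(is_edge_block((200, 200, 60, 60)))
```

None of the three Table 4.6 rows reaches the 0.4 threshold, which is consistent with the observation that the FSVQ codebook contains essentially no edge code vectors.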

The technique of Classification Vector Quantization (CVQ) [Ref. 16] (see
Figure 4.13) uses the second method discussed above to improve the subjective quality
of the reconstructed image. The image is divided into blocks as before, but now we
apply the edge detector and use a threshold to separate the image into two data sets,
one containing the edge blocks and the other containing the shade blocks. Each of
these data sets is then applied separately to an FSCL vector quantizer which is trained
until convergence. The two resulting codebooks are then concatenated to form an
overall codebook which emphasizes the edge blocks. The amount of emphasis given
to the edge blocks depends on the sizes of the codebooks allocated to the edges and
shades. For example, a codebook size of 64 could be divided into 48 shade code
vectors and 16 edge code vectors. This would give the edges more emphasis than the
original technique. Even further emphasis would be obtained if we used 32 shade and
32 edge code vectors instead.
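As a rough illustration of the concatenation step, the Python sketch below joins a shade codebook and an edge codebook and codes a block by full search over the result. The tiny 2 x 2 codebooks are invented for the example, the function names are hypothetical, and coding by full search over the concatenated codebook is an assumption about how the combined codebook is used. The last lines compare the share of the codebook given to edges with the share of edge blocks in the test image (202 of 16384).

```python
def concatenate(shade_codebook, edge_codebook):
    """Join the class codebooks; shade vectors first, edge vectors after."""
    return list(shade_codebook) + list(edge_codebook)


def encode(block, codebook):
    """Full search: index of the code vector with the smallest squared error."""
    def sq_err(code_vector):
        return sum((p - c) ** 2 for p, c in zip(block, code_vector))
    return min(range(len(codebook)), key=lambda i: sq_err(codebook[i]))


# Hypothetical codebooks, far smaller than those used in the simulations.
shade = [(90, 90, 90, 90), (200, 200, 200, 200)]
edge = [(200, 200, 60, 60), (60, 200, 60, 200)]
codebook = concatenate(shade, edge)

index = encode((190, 205, 70, 55), codebook)
print(index, "edge" if index >= len(shade) else "shade")

# Codebook share given to edges versus the share of edge blocks in the image.
for n_shade, n_edge in [(48, 16), (32, 32)]:
    print(n_edge / (n_shade + n_edge), 202 / 16384)
```

Because the shade vectors occupy the first indices, the transmitted index alone tells the decoder which class was chosen, so no side information is needed.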

The simulation results for CVQ are presented in Figures 4.24-4.26. As before,
a single test image of 256 x 256 pixels was used, and all test cases were conducted
at 1 bit/pixel. Figure 4.24 shows CVQ using a 2 x 2 block and a size 16 codebook
consisting of 8 edge and 8 shade code vectors. Figure 4.25 shows CVQ using a 3 x 2
block and a size 64 codebook consisting of 32 edge and 32 shade code vectors. Figure 4.26
shows CVQ using a 3 x 3 block and a size 512 codebook consisting of 384 edge and 128
shade code vectors. For the size 16 case (Figure 4.24) we can see that the codebook
is just too small to represent shades and edges well. The lack of enough shade code
vectors to cover the common grey levels is evident, and the few edge code vectors are
not enough to show much improvement over FSVQ. In the size 64 case (Figure 4.25)
we start to see some substantial improvement in the reproduction of the edges with
very little degradation in other areas of the image. Finally, for the size 512 case
(Figure 4.26), CVQ is substantially better in a subjective sense, and for the larger
codebooks and larger block sizes it is slightly better than FSVQ in the mean square
error sense (see Figure 4.27). It is surprising that any method could surpass the
performance of FSVQ, since we believed this method to be optimal in a mean square
sense, but this effect probably stems from the fact that FSVQ converges very slowly,
and the test cases were not run for a sufficient number of training passes to reach the
final value.

Figure 4.24  CVQ Using a Size 16 Codebook and a 2 x 2 Block

Figure 4.25  CVQ Using a Size 64 Codebook and a 3 x 2 Block

Figure 4.26  CVQ Using a Size 512 Codebook and a 3 x 3 Block

A secondary benefit of applying the CVQ technique is an enormous
computational savings over FSVQ. This occurs because the code vectors for edge and shade
blocks appear to converge at different rates. The shade code vectors have a very simple
structure and therefore converge quickly, while the edge code vectors have a complex
structure and converge slowly. In FSVQ, we use a single codebook and thus all code
vectors are run through the data set the same number of times. So long after the
shade code vectors have converged, we continue to waste computational time
updating them. In CVQ, we avoid this problem, and we are then able to concentrate our
computational efforts on the difficult part of the problem. Also, as we have seen with
TSVQ, a data set which has a large amount of structure makes the FSCL algorithm
converge more quickly. The CVQ method accomplishes this by splitting the original
data set into shade and edge blocks, which further improves convergence speed. As
we can see in Figure 4.28, CVQ has a huge computational advantage over FSVQ as
well as better performance for large codebooks.
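The source of the savings can be made concrete by counting distance evaluations. In the Python sketch below the block counts come from the text (16384 blocks of 2 x 2 pixels, 202 of them edge blocks) and the codebook sizes from the size 16 test case, but the numbers of training passes are purely hypothetical and were chosen only to illustrate the argument; they are not measured results.

```python
def distance_evaluations(passes, blocks, code_vectors):
    """Each training pass compares every block with every code vector once."""
    return passes * blocks * code_vectors


TOTAL_BLOCKS = 16384                      # 2 x 2 blocks in the 256 x 256 test image
EDGE_BLOCKS = 202                         # blocks above the 0.4 threshold
SHADE_BLOCKS = TOTAL_BLOCKS - EDGE_BLOCKS

# Hypothetical pass counts: shade code vectors are assumed to settle early.
FSVQ_PASSES, SHADE_PASSES, EDGE_PASSES = 50, 10, 50

fsvq = distance_evaluations(FSVQ_PASSES, TOTAL_BLOCKS, 16)
cvq = (distance_evaluations(SHADE_PASSES, SHADE_BLOCKS, 8)
       + distance_evaluations(EDGE_PASSES, EDGE_BLOCKS, 8))
print(fsvq, cvq, fsvq / cvq)
```

Even with equal pass counts CVQ does less work, since each block is compared only with the code vectors of its own class; stopping the shade training early adds to that saving.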
TABLE 4.3:  Number of Codebooks Required

                                          Codebooks
R          Block Size   Codebook Size   TSVQ    MSVQ
(2,2)      2 x 2        16              5       2
(3,3)      3 x 2        64              9       2
(3,3,3)    3 x 3        512             73      3
TABLE 4.4:  Code Vector Storage Requirements

                                          Code Vectors
R          Block Size   Codebook Size   FSVQ    TSVQ    MSVQ
(2,2)      2 x 2        16              16      20      8
(3,3)      3 x 2        64              64      72      16
(3,3,3)    3 x 3        512             512     584     24

TABLE 4.5:  Channel Load of Codebook Transmission (bits/pixel)

                                          Channel Load (bits/pixel)
R          Block Size   Codebook Size   FSVQ    TSVQ    MSVQ
(2,2)      2 x 2        16              0.008   0.010   0.004
(3,3)      3 x 2        64              0.047   0.053   0.012
(3,3,3)    3 x 3        512             0.563   0.642   0.026
TABLE 4.6:  Example Codebook

Codeword   Pixel 1   Pixel 2   Pixel 3   Pixel 4
 1         159       167       186       193
 2         200       200       202       202
 3         104       105       123       127
 4          86        86        87        87
 5         223       223       224       223
 6         133       126       107       104
 7         217       217       217       217
 8         137       137       141       141
 9         208       209       209       209
10         231       231       231       231
11         175       174       175       175
12         160       159       156       156
13          99        99        99        99
14         240       240       240       240
15         189       189       190       190
16         204       199       177       166
Figure 4.27  Performance vs. Block Size

V.  CONCLUSIONS

In this thesis we examined some existing algorithms to implement vector
quantization using neural networks. We also applied three techniques to improve
performance and reduce computational cost. In the previous chapter we presented each
technique separately. Here we will compare the relative performance of each of the
three algorithms. Since each algorithm has its strengths and weaknesses, we also
make suggestions about the likely situations where each of these techniques may be
appropriate.

First let us discuss image reproduction quality. It can be seen from Figure 4.27
that for a given block size, FSVQ, TSVQ, and CVQ all offer a similar level of
performance in a mean square sense, while MSVQ performs noticeably worse. To compare
performance in a subjective sense, we present the best results obtained for each
technique in the following figures. Figure 5.1 shows the FSVQ algorithm using a 3 x 3
block, Figure 5.2 shows TSVQ using a 3 x 3 block, Figure 5.3 shows MSVQ using
a 4 x 3 block, and Figure 5.4 shows CVQ using a 3 x 3 block. Here we see that
CVQ has a small advantage over FSVQ and TSVQ in a subjective sense, and MSVQ
is again noticeably worse.

Now we examine the issue of computational cost. We can see from Figure 4.28
that for a given block size, FSVQ has the highest computational cost, TSVQ is the
next highest, and CVQ and MSVQ have very similar and much smaller computational
costs. Perhaps a better way to rate the computational cost is to relate it to
performance. Figure 5.5 shows the relationship between cost and performance for each test
case performed. The most desirable algorithms are represented by points in the
lower left portion of the graph. We can see that the best combination of reproduction
quality and computational cost is given by the CVQ algorithm.

Figure 5.1  FSVQ

Figure 5.2  TSVQ

Figure 5.3  MSVQ

Figure 5.4  CVQ

Although CVQ offers the best reproduction quality even when computational
cost is considered, the other two algorithms presented have advantages of their own.
Both MSVQ and TSVQ offer a huge savings in the number of processing elements
required because of the linear structure each displays. Thus for a hardware
implementation these two techniques should be considered. Also, we have seen that for
large codebook sizes the load on the channel due to codebook transmission becomes
significant. So if our application requires an extremely large codebook, the MSVQ
algorithm must be considered, as it is able to form a large codebook with very little
load on the channel (see Table 4.5).

This research has shown that neural networks can be very effective in the
implementation of vector quantization. With the application of algorithms such as
CVQ, TSVQ, and MSVQ, we can improve the performance of neural network vector
quantizers and make application of the vector quantization technique more practical.

A.      ADDITIONAL  WORK 

Research  is  planned  in  the  area  of  adaptive  filters  in  an  effort  to  improve  the 
convergence  speed  of  the  FSCL  algorithm.  In  addition,  it  is  planned  to  investigate 
other  current  vector  quantization  techniques  and  determine  if  neural  network  vector 
quantizers  can  be  improved  by  their  application.  After  these  steps  are  completed,  an 
effort  to  combine  several  of  the  techniques  chosen  will  be  conducted  in  the  hope  of 
further  improving  overall  performance.  Finally,  we  plan  to  apply  the  techniques  in 
this  thesis  to  the  coding  of  speech  data. 
APPENDIX A:  PROGRAM DETAILS

This appendix contains the program flowcharts and listings for each of the
algorithms in the thesis. Figures A.1-A.4 show the flowcharts, and the program
listings follow.

Figure A.1  Basic FSCL Algorithm

Figure A.2  Tree Search Algorithm
function x=imgcon1(y)

% program to convert an array of image data into vector format using
% blocks of arbitrary size

% input variable   y   = subject image in row format

% output variable  x   = subject image in vector format

% local variables  N   = vector which stores the dimensions of y
%                  n2  = height of desired block input by user
%                  n1  = width of block input by user
%                  n1a = number of blocks to process in horizontal direction
%                  n2a = number of blocks to process in vertical direction
%                  k   = index to track number of blocks processed
%                  i1  = vertical placekeeper in subject image
%                  j1  = horizontal placekeeper in subject image
%                  z   = temporary storage for desired block

N=size(y);                       % initialize dimensions of input image (rows, columns)
n2=input('height of block  ');   % get height of block from user
n1=input('width of block  ');    % get width of block from user
n1a=floor(N(2)/n1);              % find # of blocks to process horiz. (columns / width)
n2a=floor(N(1)/n2);              % find # of blocks to process vertic. (rows / height)
x=zeros(n1*n2,n1a*n2a);          % initialize output
for i=1:n2a                      % main loop: move vertically in image
  i1=(i-1)*n2+1                  % set vertical placekeeper (echoed to show progress)
  for j=1:n1a                    % inner loop: move horizontally in image
    k=(i-1)*n1a+j;               % track number of blocks processed
    j1=(j-1)*n1+1;               % set horizontal placekeeper
    z=y(i1:i1+n2-1,j1:j1+n1-1);  % get desired block from image
    x(:,k)=z(:);                 % make conversion from block to vector
  end
end
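For readers without MATLAB, the block-to-vector conversion can be sketched in Python as follows. This is an illustrative rendering, not the thesis program: the function name is hypothetical, the image is assumed to be a list of rows, and the block size is passed as arguments instead of being read from the keyboard. Pixels within a block are taken column by column, matching the ordering that MATLAB's z(:) produces.

```python
def image_to_vectors(image, block_h, block_w):
    """Split an image (list of rows) into blocks and flatten each block."""
    rows, cols = len(image), len(image[0])
    vectors = []
    # Partial blocks at the right and bottom borders are dropped.
    for top in range(0, rows - block_h + 1, block_h):
        for left in range(0, cols - block_w + 1, block_w):
            # Column-major order inside the block.
            vectors.append([image[top + r][left + c]
                            for c in range(block_w)
                            for r in range(block_h)])
    return vectors


image = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12],
         [13, 14, 15, 16]]
for vector in image_to_vectors(image, 2, 2):
    print(vector)
```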
function [w,u]=cbinit2(x)

% program to initialize the code book

% input variable   x  = data set in vector format

% output variables w  = initial code book
%                  u  = initialized frequency vector

% local variables  N  = desired number of code words
%                  Nx = number of data vectors

% this program initializes the codebook by randomly selecting data vectors from the
% subject data set. It also sets up the initial frequency vector for the codebook
% with all values initialized to 1.

N=input('number of code words  ');
% rand('uniform')                  % legacy call selecting the uniform generator;
                                   % rand is uniform by default in current MATLAB
Nx=max(size(x));
for i=1:N
  w(:,i)=x(:,ceil(Nx*rand(1)));    % copy a randomly chosen data vector
end
u=ones(1,N);
function [w,u]=fscl(x,w,u)

% program to implement frequency sensitive competitive learning
%
% input variables  x  = subject data set arranged into vectors of appropriate size
%                  w  = existing weight matrix
%                  u  = existing win frequency vector
%
% output variables w  = updated weight matrix
%                  u  = updated win frequency vector
%
% local variables  nx = size of data vectors
%                  Nx = number of data vectors   Caution: Nx must be > nx
%                  N  = vector containing the size and number of weight vectors in w
%                  y  = ones vector used to set up comparison of distances
%                  d  = vector which stores the distance for each code vector
%                  md = the minimum distance contained in d
%                  iw = the index of code vector with minimum distance
%                  ep = learning rate

% This program conducts a single pass through data set x using the FSCL algorithm. The
% weight matrix, w, and win frequency vector, u, are updated and passed back to the calling
% routine.

nx=min(size(x));                         % initialize size of data vector
Nx=max(size(x));                         % initialize number of data vectors
N=size(w);                               % initialize dimensions of weight matrix
y=ones(1,N(2));                          % initialize ones vector

for k=1:Nx                               % main loop: perform once for each data vector
  d=sum((x(:,k)*y-w).^2);                % calculate distance for each code vector
  d=d.*u;                                % apply fairness function to distance
  [md,iw]=min(d);                        % find minimum distance
  ep=0.01*exp(-u(iw)/10000);             % determine learning rate for nearest
                                         % code vector
  w(:,iw)=w(:,iw)+ep*(x(:,k)-w(:,iw));   % update weight vector for nearest code vector
  u(iw)=u(iw)+1;                         % update number of wins for nearest code vector
end
function m=mse(x,w,m)

% program to measure mean square error of codebook

% input variables  x    = subject data set arranged into vectors of appropriate size
%                  w    = weight matrix of codebook to be measured
%                  m    = record of mse for previous versions of code book
%
% output variable  m    = updated record of mse measurements
%
% local variables  Nx   = number of data vectors   Caution: Nx must be > nx
%                  N    = vector containing the size and number of weight vectors in w
%                  y    = ones vector used to set up comparison of distances
%                  d    = vector which stores the distance for each code vector
%                  mse1 = accumulator for current mse

% this program makes a single pass through the data set in order to measure the mse.
% the mse is then appended to an existing vector, m, which has the mse record for
% each iteration of the codebook

N=size(w);                   % initialize size of weight matrix
Nx=max(size(x));             % initialize number of data vectors
mse1=0;                      % initialize mse accumulator
y=ones(1,N(2));              % initialize ones vector
for k=1:Nx                   % main loop: execute once for each data vector
  d=sum((x(:,k)*y-w).^2);    % calculate distance for each weight vector
  mse1=mse1+min(d);          % increment mse accumulator
end
mse1=mse1/(Nx*N(1));         % normalize mse
m=[m,mse1];                  % append new mse value to previous record
function [z,mse]=code(x,w)
function x=imgcon3(y)

% program to convert an image in vector format to an image in row
% format with an arbitrary vector size

% input variable   y   = subject image in vector format

% output variable  x   = subject image in row format

% local variables  N   = vector which stores the dimensions of y
%                  n2  = height of block input by user
%                  n1  = width of block input by user
%                  n3  = size of desired output image
%                  n1a = number of blocks to process in horizontal direction
%                  n2a = number of blocks to process in vertical direction
%                  k   = index to track number of blocks processed
%                  i1  = vertical placekeeper in subject image
%                  j1  = horizontal placekeeper in subject image
%                  z   = temporary storage for desired block

N=size(y);                            % initialize dimensions of input image
n2=input('height of block  ');        % get height of block from user
n1=input('width of block  ');         % get width of block from user
n3=input('size of output image  ');   % get desired output image size
n1a=floor(n3/n1);                     % find # of blocks in horiz. direction
n2a=floor(n3/n2);                     % find # of blocks in vert. direction
x=zeros(n3,n3);                       % initialize output image
for i=1:n2a                           % main loop: move vertically
  i1=(i-1)*n2+1                       % set vertical placekeeper (echoed to show progress)
  for j=1:n1a                         % inner loop: move horizontally
    k=(i-1)*n1a+j;                    % update number of blocks processed
    j1=(j-1)*n1+1;                    % set horizontal placekeeper
    z=zeros(n2,n1);                   % initialize temporary storage
    for l=1:n1                        % loop to convert vector to block
      m=(l-1)*n2+1;                   % find section of vector to process
      z(:,l)=y(m:m+n2-1,k);           % get segment of vector
    end
    x(i1:i1+n2-1,j1:j1+n1-1)=z;       % put completed block into image
  end
end
% program to initialize the codebook for TSVQ

% variables  x1,x2,... = input data sets in vector form
%            w1,w2,... = initial code books
%            u1,u2,... = initialized frequency vectors
%            N         = desired number of code words
%            Nx        = number of data vectors in data set being processed

n=input('size of input vector  ');          % get size of input vector
N=input('number of codewords  ');           % get desired number of codewords
nb=input('number of branches in tree  ');   % get number of branches in tree

% this program constructs the code book initialization for TSVQ by randomly
% choosing input data vectors from each data set

% rand('uniform')                           % legacy call to set up random number generator;
                                            % rand is uniform by default in current MATLAB

for p=1:nb                                  % main loop: execute once for each branch of tree
  eval(['Nx=size(x',int2str(p),');']);      % get size of current data set
  Nx=Nx(2);
  for i=1:N                                 % inner loop: choose N random vectors from data set
    m=ceil(Nx*rand(1,1));                   % select random number
    eval(['w',int2str(p),'(:,i)=x',int2str(p),'(:,m);'])
                                            % place selected vector in appropriate code book
  end
  eval(['u',int2str(p),'=ones(1,N);']);     % initialize frequency counter
  eval(['m',int2str(p),'=[];']);            % initialize mse history
end
% program to sort vector for tree searched code

% variables  Nx        = number of data vectors   Caution: Nx must be > nx
%            N         = vector containing the size and number of weight vectors in w
%            y         = ones vector used to set up comparison of distances
%            d         = vector which stores the distance for each code vector
%            md        = minimum distance from data vector to a code word
%            iw        = index of minimum distance in d
%            x         = subject data set arranged into vectors of appropriate size
%            w         = weight matrix of codebook to be used for sorting
%            x1,x2,... = data sets of vectors for use in next stage
%            count     = vector to track size of output data sets

% this program performs the sorting of the input data set for use by the second
% level of t

N=size(w);                     % initialize dimensions of w
Nx=max(size(x));               % initialize number of input vectors
mse=0;                         % initialize mse
y=ones(1,N(2));                % initialize ones vector

for k=1:N(2)                   % this loop initializes the output data sets
  eval(['x',int2str(k),'=zeros(N(1),ceil(Nx/4));']);
  count(k)=0;                  % initialize size of output data sets
end

for k=1:Nx                     % main loop: execute once for each input vector
  d=sum((x(:,k)*y-w).^2);      % calculate distances for each code vector
  [md,iw]=min(d);              % find closest code vector
  count(iw)=count(iw)+1;       % update size of output data set chosen
  eval(['x',int2str(iw),'(:,count(iw))=x(:,k);']);   % update output data set
  if rem(k,1000)==0, k, end    % update progress to screen
end

for k=1:N(2)                   % this loop truncates the output data sets to eliminate
                               % the unused portion of the allocated space
  eval(['x',int2str(k),'=x',int2str(k),'(:,1:count(k));']);
end
% program to code an image for tsvq

% variables  x         = data set with image in vector format
%            w         = weight matrix for vector quantizer at first level
%            w1,w2,... = weight matrices for vector quantizer at second level
%            wa        = weight matrix chosen for use at second level
%            z         = approximate image produced by coding in vector format
%            mse       = mean square error of approximation, z
%            N         = vector containing size and number of weight vectors in w
%            Nx        = number of data vectors
%            y         = ones vector used to set up comparison of distances
%            d         = vector which stores the distance for each code vector
%            d2        = vector distances for code book at second level
%            md        = the minimum distance contained in d
%            md2       = the minimum distance contained in d2
%            iw        = the index of the code vector with minimum distance
%            iw2       = index of closest code vector in level two

% this program performs coding for a two level TSVQ. The input and output
% images are both in vector format

N=size(w1);                          % initialize dimensions of w
Nx=max(size(x));                     % initialize number of input data vectors
mse=0;                               % initialize mse
y=ones(1,N(2));                      % initialize ones vector
z=zeros(N(1),Nx);                    % initialize output image

for k=1:Nx                           % main loop: execute once for each input vector
  d=sum((x(:,k)*y-w).^2);            % find distances for code book at first level
  [md,iw]=min(d);                    % find closest code vector at first level
  eval(['wa=w',int2str(iw),';']);    % pick weight matrix to be used at level two
  d2=sum((x(:,k)*y-wa).^2);          % find distances for code book at level two
  [md2,iw2]=min(d2);                 % find closest code vector at level two
  z(:,k)=wa(:,iw2);                  % place approximation in output image
  mse=mse+sum((x(:,k)-z(:,k)).^2);   % increment mse
  if rem(k,1000)==0, k, end          % update progress to screen
end
mse=mse/(Nx*N(1));                   % normalize mse
function z=mssort(x,w)

% program to set up multi stage vq

% input variables  x  = subject data set arranged into vectors of appropriate size
%                  w  = weight matrix of codebook to be used for sorting
%
% output variable  z  = data set of error vectors for use in next stage
%
% local variables  Nx = number of data vectors   Caution: Nx must be > nx
%                  N  = vector containing the size and number of weight vectors in w
%                  y  = ones vector used to set up comparison of distances
%                  d  = vector which stores the distance for each code vector
%                  md = minimum distance from data vector to a code word
%                  iw = index of minimum distance in d

% this program takes a data set and a code book and performs one pass through each
% data vector, finding the closest code vector and calculating and storing the
% error. This new data set is used for the next stage in Multi Stage Vector
% Quantization.

N=size(w);                    % initialize number and size of weight vectors
Nx=max(size(x));              % initialize number of data vectors
y=ones(1,N(2));               % initialize ones vector
z=zeros(N(1),Nx);             % initialize error data set

for k=1:Nx                    % main loop: execute once for each data vector
  d=sum((x(:,k)*y-w).^2);     % calculate distance for each code word
  [md,iw]=min(d);             % find the minimum distance
  z(:,k)=x(:,k)-w(:,iw);      % calculate and store the error vector
  if rem(k,1000)==0, k, end   % update progress every 1000 data vectors
end
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function  [ z, mse ] =mscode (x, w, wl ) 

%  program  to  code  an  image  lor  ms  vq 


%  input  variables 


wl 


data  set  with  image  in  vector  format 
weight  matrix  for  vector  quantizer  at  first  level 
=  weight  matrix  for  vector  quantizer  at  second  level 


%  output  variables  z   =  approximate  image  produced  by  coding  in  vector  format 
%  mse  "  mean  square  error  of  approximation,  z 


local  variables  N 
Nx 


vector  containing  size  and  number  of  weight  vectors  in  w 

number  of  data  vectors 
x2  =  data  set  containing  first  level  error 
y   =  ones  vector  used  to  set  up  comparison  of  distances 
d  =  vector  which  stores  the  distance  for  each  code  vector 
d2  =  vector  distances  for  code  book  at  second  level 
md  =  the  minimum  distance  contained  in  d 
md2  =   the  minimum  distance  contained  in  d2 
iw  =  the  index  of  the  code  vector  with  minimum  distance 
iw2  =  index  of  closest  code  vector  in  level  two 


%  this  program  performs  coding  for  the  MSVQ  algorithm.  This  version  is 
%  to  a  two  level  architecture.  The  input  and  output  image  are  both  in 
*  vector  format. 


N=size(w);                               % initialize dimensions of w
Nx=max(size(x));                         % initialize number of data vectors
mse=0;                                   % initialize mse
y=ones(1,N(2));                          % initialize ones vector
z=zeros(N(1),Nx);                        % initialize output image

for k=1:Nx                               % main loop : execute once for each data vector
  d=sum((x(:,k)*y-w).^2);                % find distances for first level code book
  [md,iw]=min(d);                        % find closest code vector on first level
  x2=x(:,k)-w(:,iw);                     % form first level error
  d2=sum((x2*y-w1).^2);                  % find distances for second level code book
  [md2,iw2]=min(d2);                     % find closest code vector on second level
  z(:,k)=w(:,iw)+w1(:,iw2);              % form second approximation to input vector
  mse=mse + sum((x(:,k)-z(:,k)).^2);     % increment mse
  if rem(k,1000)==0, k, end              % update progress to screen
end

mse=mse/(Nx*N(1));                       % normalize mse
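
The two-level search in mscode can be traced by hand on a single vector. The following is a minimal sketch, not part of the original listing: the code books w and w1 and the input x are hypothetical toy values chosen only so that the arithmetic is easy to follow. As in the listing, both code books are assumed to hold the same number of code vectors, because the same ones vector y is used at both levels.

```matlab
% Toy two-level VQ of one 2-element vector (all numeric values are hypothetical).
w  = [0 4; 0 4];                        % first-level code book: columns [0;0] and [4;4]
w1 = [-1 1; -1 1];                      % second-level code book: columns [-1;-1] and [1;1]
x  = [3; 3.5];                          % single input vector

y  = ones(1, size(w, 2));               % ones vector replicates x across the code book columns
[md, iw]   = min(sum((x*y - w).^2));    % closest first-level code vector
x2 = x - w(:, iw);                      % first-level error
[md2, iw2] = min(sum((x2*y - w1).^2));  % closest second-level code vector
z  = w(:, iw) + w1(:, iw2);             % two-level approximation of x
mse = sum((x - z).^2) / numel(x);       % squared error per element, as in the mscode normalization
```

Here the first level selects [4;4], the residual [-1;-0.5] is matched by [-1;-1], and the reconstruction is z = [3;3].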
function [x1,x2]=class1(y)

% program to convert an array for use in classified vq

% input variable   y   = subject image in row format

% output variables x1  = vector format array of edge blocks
%                  x2  = vector format array of shade blocks

% local variables  N   = vector which stores the dimensions of y
%                  n2  = height of desired block input by user
%                  n1  = width of block input by user
%                  n1a = number of blocks to process in horizontal direction
%                  n2a = number of blocks to process in vertical direction
%                  k   = index to track number of blocks processed
%                  i1  = vertical placekeeper in subject image
%                  j1  = horizontal placekeeper in subject image
%                  z   = array used to evaluate edge detector ratio

% this program takes an image in row format applies an edge detector, and
% outputs two data sets in vector format. The first data set consists of
% the edge blocks, and the second consists of the shade pixels.


N=size(y);
n2=input('height of block    ');
n1=input('width of block     ');
n1a=floor(N(1)/n1);
n2a=floor(N(2)/n2);
x1=zeros(n1*n2,n1a*n2a);
x2=x1;
count1=0;
count2=0;

for i=1:n2a
  i1=(i-1)*n2+1;
  for j=1:n1a
    k=(i-1)*n1a+j;
    j1=(j-1)*n1+1;
    z=y(i1:i1+n2-1,j1:j1+n1-1);
    z1=z(:);
    z2(1)=(z1(1)-z1(2))/max(z1(1),z1(2));
    z2(2)=(z1(1)-z1(3))/max(z1(1),z1(3));
    z2(3)=(z1(1)-z1(4))/max(z1(1),z1(4));
    z2(4)=(z1(2)-z1(3))/max(z1(2),z1(3));
    z2(5)=(z1(2)-z1(4))/max(z1(2),z1(4));
    z2(6)=(z1(3)-z1(4))/max(z1(3),z1(4));
    if max(abs(z2)) > 0.4
      count1=count1+1;
      x1(:,count1)=z1;
    else
      count2=count2+1;
      x2(:,count2)=z1;
    end
  end
end

x1=x1(:,1:count1);
x2=x2(:,1:count2);
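
The edge test in class1 compares every pair among the first four pixels of a block. The sketch below applies the same test to one block and is illustrative only: the pixel values are hypothetical, and a 2 by 2 block is assumed because the listing's ratios use only z1(1) through z1(4). The use of nchoosek to enumerate the six pairs is a convenience of this sketch, not something the original listing does.

```matlab
% Classify one 2x2 block with the pairwise ratio test (pixel values are hypothetical).
z  = [200 190; 90 100];          % bright top row, dark bottom row
z1 = z(:);                       % column-major vector: [200; 90; 190; 100]
pairs = nchoosek(1:4, 2);        % the six pairs (1,2),(1,3),(1,4),(2,3),(2,4),(3,4)
z2 = zeros(1, 6);
for p = 1:6
  a = z1(pairs(p, 1));
  b = z1(pairs(p, 2));
  z2(p) = (a - b) / max(a, b);   % relative difference of the pair
end
isedge = max(abs(z2)) > 0.4;     % same 0.4 threshold as the listing
```

For this block the first ratio is (200-90)/200 = 0.55, which exceeds the threshold, so the block would be stored in x1 as an edge block.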